ABSTRACT

We use Bayesian estimation on direct T-Q-U CMB polarization maps to
forecast errors on the tensor-to-scalar power ratio $r$, and hence on
primordial gravitational waves, as a function of sky coverage
$f_{\rm sky}$. This T-Q-U matrix likelihood filters the quadratic
pixel-pixel space into the optimal combinations needed for $r$
detection for cut skies, providing enhanced information over a
first-step linear separation into a combination of E, B and mixed
modes, and ignoring the latter. With current computational power and
for typical resolutions appropriate for $r$ detection, the large
matrix inversions required are accurate and fast. Our simulations
explore two classes of experiments, with differing bolometric detector
numbers, sensitivities and observational strategies. One is motivated
by a long duration balloon experiment like Spider, with pixel noise
$\propto \sqrt{f_{\rm sky}}$ for a specified observing period. This
analysis also applies to ground-based array experiments. We find that,
in the absence of systematic effects and foregrounds, an experiment
with Spider-like noise concentrating on $f_{\rm sky} \sim 0.02$-$0.2$
could place a $2\sigma_r \approx 0.014$ bound ($\sim 95\%$ CL), which
rises to 0.02 with an $\ell$-dependent foreground residual left over
from an assumed efficient component separation. We contrast this with
a Planck-like fixed instrumental noise as $f_{\rm sky}$ varies, which
gives a Galaxy-masked ($f_{\rm sky} = 0.75$) $2\sigma_r \approx 0.015$,
rising to $\approx 0.05$ with the foreground residuals. Using for a
figure of merit the (marginalized) 1D Shannon entropy of $r$, taken
relative to the first 2003 WMAP1 CMB-only constraint, gives -1.7 bits
from the 2010 WMAP7+ACT data, -1.9 bits from the 2011 WMAP7+SPT data,
and forecasts of -6 bits from Spider (plus Planck); this compares with
up to -11 bits for the CMBPol, COrE and PIXIE post-Planck satellites
and -13 bits for a perfectly noiseless cosmic variance limited
experiment. We thus confirm the wisdom of the current strategy for $r$
detection of deeply probed patches covering the $f_{\rm sky}$
minimum-error trough with balloon and ground experiments.

Subject headings: Cosmic background radiation - Cosmological parameters - Cosmology: theory - 
Methods: numerical 



1. INTRODUCTION 

Inflation, a period of accelerated expansion in the very 
early universe, is the most widely accepted scenario to 
solve the problems of the otherwise successful standard 
model of cosmology. In the simplest models the ex- 
pansion is driven by an effective potential energy of a 
single scalar field degree of freedom, the inflaton. An 
unavoidable consequence is the quantum generation of 
scalar and tensor zero-point fluctuations in the space- 
time metric. The former are 3-curvature perturbations, 
with associated density fluctuations that can grow via 
gravitational instability to create the cosmic web, with 
its rich observational characterization. The latter are 
gravity waves that induce potentially observable signa- 
tures in the spatial structure of the Cosmic Microwave 
Background (CMB), in particular in its polarization, the 
focus of this paper. Whereas curl-free E-modes of polarization
can be produced both by tensor and scalar perturbations,
divergence-free modes of CMB polarization (B-modes) would be
induced on large scales by primordial gravitational waves but
not by scalar curvature fluctuations. Many experiments are in
quest of this inflation signature, but the predicted signal,
if detectable,
is very small and subject to contamination by leakages 
from the total anisotropy T and from the dominant E 
polarization, as well as by other systematic effects, so 
extraordinary care is needed to analyze such data. At 
smaller scales, B modes are induced from primordial E 
modes through gravitational lensing distortions of the 
CMB polarization patterns, adding to the complexity of 
making a clean separation of the tensor-induced signal. 

The primordial scalar and tensor power spectra (fluctuation
variances per $\ln k$) and their ratio $r(k)$ are often
approximated by power laws in the 3D comoving wavenumber $k$,

$$ \mathcal{P}_s(k) \approx A_s(k_{sp})\,(k/k_{sp})^{n_s(k_{sp})-1}, $$
$$ \mathcal{P}_t(k) \approx A_t(k_{tp})\,(k/k_{tp})^{n_t(k_{tp})}, $$
$$ r(k) \equiv \mathcal{P}_t(k)/\mathcal{P}_s(k) \approx r\,(k/k_{tp})^{n_t(k_{tp}) - n_s(k_{tp}) + 1}, $$
$$ r \equiv r(k_{tp}) = \mathcal{P}_t(k_{tp})/\mathcal{P}_s(k_{tp}). $$



The scalar and tensor pivots $k_{sp}$ and $k_{tp}$ about which
the expansions occur are usually chosen to be different for
scalars and tensors to reflect where the optimal signal weights
come from. The main target of many of the current and coming
CMB polarization experiments is, firstly, a one-parameter
uniform $r$. An advantage of this ratio over
$\mathcal{P}_t(k_{tp})$ is that it removes a dominant
near-degeneracy with the Thomson depth to Compton scattering
$\tau$. The spectrum $r(k)$ also measures the inflation
acceleration history $\epsilon(a)$, and can be directly related
to the inflaton potential energy through this relation:

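$$ r(k) \approx 16\,\epsilon(a \approx k/H), \quad \epsilon \equiv -d\ln H/d\ln a, $$
$$ V \approx r\,M_P^4\,\mathcal{P}_s\,(3\pi^2/2)\,(1 - r/48) \sim (10^{16}\,{\rm GeV})^4\, r/0.1 . \qquad (1) $$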



Here $M_P = 1/\sqrt{8\pi G}$ is the reduced Planck mass, with
$c$ and $\hbar$ set to unity. The relation $k \approx Ha$, of
resolution $k^{-1}$ to the dynamics encoded in the expansion
and Hubble parameters, $a$ and $H$, is only approximate of
course, but very useful (e.g., Bond 1996). A detection of
$r \sim 0.03$-$0.2$ would provide a strong pointer to the
specific inflation model. A tight upper bound, $r < 0.03$,
would rule out a very large class of inflation scenarios, a
bound that is achievable with the experiments we explore here.
In this paper, we often use $r_{\rm fid} = 0.12$ as a fiducial
high-$r$ case for tests, since it is near the 0.13 coming from
the simplest $V = m^2\phi^2/2$ chaotic inflation model, and has
an inflation energy scale $V^{1/4}$ near $10^{16}$ GeV. We also
explore the very small $r \lesssim 0.01$ regime.

We would like to learn as much as we can about the full $r(k)$,
hence $\epsilon(a)$, from CMB data. In addition to the
deviations of the slopes from scale invariance ($n_t = 0$ and
$n_s - 1 = 0$), the slopes are expected to "run with $k$" just
as the power does, although they may be approximately constant
over the observable CMB range. The first order variations in
$\ln k$ define scalar and tensor "running" parameters, the
first terms in polynomial expansions in higher order "running
of running" parameters. In this paper $n_s(k)$ is not our
target, nor are high multipole CMB experiments which are
necessary to get the long baseline needed to show whether $n_s$
runs or not.

A consequence of the fall-off of the tensor-induced CMB signal
beyond $\ell \sim 150$ is that only limited information can be
obtained on $n_t(k)$: enough to allow a number of broad bands
for $r(k)$, but not enough for $n_t(k_{tp})$, let alone
$n_t(k)$, to be determined with sufficient accuracy to test
well the inflation consistency relation for gravity waves. In
the limited 2-parameter tensor parameter space of $r$ and
uniform $n_t$, this consistency condition is



$$ n_t \approx -\frac{r}{8}\,\frac{1}{1 - r/16}, \qquad (2) $$



so a convincing test would require an order of magnitude better
determination of $n_t$ than $r$. Another complication in
relating the experiments to inflation theory is that there is
still observational room for subdominant scalar isocurvature
perturbations in addition to the dominant curvature ones when
multiple fields are dynamically important during or immediately
after inflation; such fields are widely invoked for catalyzing
the production of entropy at the end of inflation. Isocurvature
perturbations with a nearly scale invariant primordial spectrum
have significantly enhanced low-$\ell$ CMB power because of the
isocurvature effect (Bond 1996), and that region, overlapping
with the gravity wave induced CMB power, is where the
constraint on the overall isocurvature amplitude comes from
(Sievers et al. 2007).

All CMB polarization experiments are limited in sky coverage by
instrumental or Galactic foreground constraints. Thus, even
though the B modes provide a unique $r$-signature and are
orthogonal to the E modes over the full sky, realistically
mode-mixing must always be dealt with, and it may be larger for
smaller $f_{\rm sky}$. Assessing the trade-offs between shallow
large-sky and deep small-sky observational strategies is the
target of our investigation. Going for deep and small has the
advantage that one can select the most foreground-free patches
to target to decrease the high level of foreground subtraction.
As well, the long waves which dominate foregrounds are
naturally filtered. Ground-based or balloon-borne experiments
using the deep and small-sky strategy are: BICEP and BICEP2,
QUIET, PolarBear, EBEX, Spider, Keck (Sheehy et al. 2011), ABS,
and PIPER (Chuss et al. 2010). Planck (and WMAP) are
(relatively) shallow and large-sky. Proposed next-generation
satellite experiments such as COrE (The COrE Collaboration
2011), PIXIE (Kogut et al. 2011) and LiteBIRD are deep and
large-sky.

In this paper, we first review the general Bayesian framework
for determining parameters to introduce the notations we use.
We cast the quest for $r$ into an information-theoretic
language in which the forecasted outcomes of different
experiments can be contrasted by considering the differences in
their reduced a posteriori Shannon entropies for $r$,
$S_{1f}(r|{\rm expt})$. We discuss the two basic approaches for
constraining cosmological observables, such as those associated
with inflation, and the relation of these to E-B mixing: (1)
the $\ell$-space approach in which CMB maps are first
compressed onto power spectrum parameters for TT-TE-EE and BB,
which are then compressed onto cosmic parameters; and (2)
direct parameter extraction of $r$ from map likelihoods. Our
primary target is $r$ and not the B-mode spectrum, hence the
optimal one-step estimation from maps is preferred, provided it
is computationally feasible, which it is for Spider-like
experiments. The leakage between the E and B modes and its
impact on $r$ is quantified in § 3. In § 2 we present details
of the method we use to bypass explicit E-B de-mixing and apply
it to simulated data for realistic instrumental and
foreground-residual noise levels for Spider-like and
Planck-like experiments as $f_{\rm sky}$ varies. We end with
our conclusions from this study.

2. BAYESIAN CMB ANALYSIS OF MAPS, BANDPOWERS 
AND COSMIC PARAMETERS 

As has become conventional in CMB analysis, the framework
envisaged to reduce the information from Spider-like raw time
ordered data to constraints on cosmic parameters, in particular
our target $r$, is one of a long Bayesian chain of conditional
probabilities (Bond 1996; Bond & Crittenden 2001). To introduce
our notation, we review that framework with polarization. We
also remark on how the associated conditional Shannon entropies
decrease as we flow along the Bayesian chain, a novel way of
looking at what is being done as the data is reduced to a
precious set of parameter bits.

2.1. Reducing Noisy Data with Bayesian Chains 
2.1.1. The Information Action in Bayesian Chains

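In Bayesian analysis, we wish to construct the a posteriori
probability distribution of parameters
$q = (q_1, \ldots, q_N)$, $P(q|\mathcal{D},\mathcal{T})$, an
update from the a priori probability $P(q|\mathcal{T})$ on the
theory space $\mathcal{T}$ of the parameters that is driven by
the likelihood
$\mathcal{L}(q|\mathcal{D},\mathcal{T}) \equiv P(\mathcal{D}|q,\mathcal{T})$
of the data $\mathcal{D}$ given $q$:

$$ P(q|\mathcal{D},\mathcal{T}) = P(\mathcal{D}|q,\mathcal{T})\,P(q|\mathcal{T})/P(\mathcal{D}|\mathcal{T}). $$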

The prior may include theoretical prejudice, information
derived from other data, and, at the very least, the specific
measure adopted for the parameters. The evidence,
$P(\mathcal{D}|\mathcal{T})$, a single normalization, is also
needed to ensure the posterior integrates to unity. Its
determination is generally computationally intense if one
integrates over all parameter space, but it may only be needed
at late stages of reduction, e.g., over 2D and 1D reduced
parameter spaces.

We can insert various further conditional probabili- 
ties on the path to the confidence limits on r from the 
fully reduced data. Examples in the transition from 
multichannel timestreams are: to multifrequency maps; 
to component-separated maps; to bandpowers of cosmic
spectra; to cosmic and nuisance parameters; to r. It is 
feasible to skip over the reduction-to-bandpowers step 
for Spider-like experiments because the number of pixels 
required will allow us to do a direct leap from the maps. 

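We express the Bayesian chain for the posterior in terms of an
information action $S_I(q)$, an energy-like (in temperature
units) Euclidean action function that includes the likelihood
and the prior:

$$ P(q|\mathcal{D},\mathcal{T}) = e^{-\ln P(\mathcal{D}|\mathcal{T})}\, e^{-S_I(q)}, $$
$$ S_I \equiv -\ln P(\mathcal{D}|q,\mathcal{T}) - \ln P(q|\mathcal{T}), $$
$$ P(\mathcal{D}|\mathcal{T}) = \int d^N q\, e^{-S_I(q)} . \qquad (3) $$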

The more elements there are in the chain, the more addi- 
tive contributions there are to this energy. The evidence 
enters like the partition function in statistical mechanics, 
and its log is the (negative of) a free energy (in dimen- 
sionless units). 

2.1.2. Reduction to Maps and Other Matched Filterings 

Initially $\mathcal{D}$ is in the form of time-ordered
information, TOIs, containing the time-ordered data, and,
typically, many flags about the data quality. The first step in
the chain is to create maps from these, with $q$ being the map
data vector $\Delta$, with components $\Delta_{cxp}$ labelled
by frequency channel $c$, Stokes polarization index
$x = T, Q, U, V$ and spatial pixel number
$p = 1, \ldots, N_{\rm pix}$. The Stokes parameters $Q, U, V$
are referred to a fixed polarization sky reference frame in
real space. (Most experiments do not have simultaneous $T, Q,
U$ and $V$ detectors.)

The solution of the parameter estimation problem in this case
is a set of (generalized) pixel means $\bar\Delta$, and a noise
covariance matrix
$C_N = \langle \delta\Delta\,\delta\Delta^\dagger \rangle$, in
terms of the noise vector $\delta\Delta = \Delta - \bar\Delta$.
Henceforth, we do not use bold letters for the matrices, which
are the most often used entities. The way one does this is to
solve $d = d_{\rm op}(q) + n$, with the operator
$d_{\rm op}(q) = \varphi q$, a linear data model with
amplitudes $q$ and templates $\varphi$. Here $d$ represents the
time ordered data. The templates form an
$N_t \times N_{\rm pix}$ matrix, where $N_t$ and $N_{\rm pix}$
are the total number of digitized time observations and pixels
(from all maps) respectively. This is a large compression of
the data, by of order $N_t/N_{\rm pix}$, done by projecting out
elements of the time-streams that are incompatible with the
templates $\varphi$. (That projected-out information is a
fertile residual space for searching for the signals of
relevance for experimental systematics studies.)

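Making maps in this way is just one example of matched-filter
processing of linear data models. The main ingredient is an
optimal filter $\psi$ constructed from the linear templates
$\varphi$ with weight $w_{nf}$:

$$ d = \varphi q + n, \quad n = d - \varphi \langle q|d \rangle_f, $$
$$ \langle q|d \rangle_f = \psi (d - \bar n) + w_{qf}^{-1} w_{qi}\, \bar q_i, \quad \psi = w_{qf}^{-1} \varphi^\dagger w_{nf}, \qquad (4) $$
$$ w_{qf} = \varphi^\dagger w_{nf} \varphi + w_{qi}, $$
$$ \delta q = q - \langle q|d \rangle_f, $$
$$ \delta n = n - \bar n, \quad \bar n = \langle n \rangle_f, \quad \bar q_i = \langle q \rangle_i, $$
$$ w_{nf}^{-1} = \langle \delta n\, \delta n^\dagger \rangle_f, \quad w_{qf}^{-1} = \langle \delta q\, \delta q^\dagger \rangle_f, \quad w_{qi}^{-1} = \langle \delta q\, \delta q^\dagger \rangle_i. \qquad (5) $$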

The residual $n$ is a "generalized noise" that is unaccounted
for in the $\varphi q$ template representation of the data
vector. We have allowed for a non-zero mean
$\langle n \rangle_f = \bar n$ of the residual (e.g., it does
not vanish in the cosmic parameter estimation discussed later).
The weight $w_{nf}$ is optimally related to the correlation in
the noise fluctuations, eq. (5), in the sense that it minimizes
the final correlation matrix of parameter errors. Other weight
choices than this optimized $w_{nf}$ can work well, at the
expense of enhanced errors on the $q$ estimators. We have added
an initial signal weight $w_{qi}$, which is updated by the data
to $w_{qf}$. A nonzero $w_{qi}$ is necessary if the dimension
of the signal space exceeds that of the data space. If this is
not the case, we often operate in the $w_{qi} \to 0$ limit.
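
As a concrete illustration of the matched-filter solution of eq. (4), the following is a minimal sketch, assuming real-valued data (so the dagger is a plain transpose) and a toy pointing matrix in which each time sample hits exactly one pixel. The function name `matched_filter` and all the toy numbers are assumptions made for this example, not part of the analysis pipeline described here. With white noise and no prior weight ($w_{qi} \to 0$), the estimate reduces to the average of the samples falling in each pixel.

```python
import numpy as np

def matched_filter(phi, d, w_n, w_qi=None, q_i=None, n_bar=None):
    """Solve d = phi q + n for q, following the structure of eq. (4).

    Returns the estimate <q|d>_f and the updated weight w_qf.
    Real-valued inputs are assumed, so phi^dagger is phi.T.
    """
    n_q = phi.shape[1]
    w_qi = np.zeros((n_q, n_q)) if w_qi is None else w_qi
    q_i = np.zeros(n_q) if q_i is None else q_i
    n_bar = np.zeros(len(d)) if n_bar is None else n_bar
    w_qf = phi.T @ w_n @ phi + w_qi                  # w_qf = phi^T w_nf phi + w_qi
    rhs = phi.T @ w_n @ (d - n_bar) + w_qi @ q_i     # w_qf <q|d>_f
    return np.linalg.solve(w_qf, rhs), w_qf

# Toy time stream: 200 samples scanning 5 pixels (hypothetical numbers).
rng = np.random.default_rng(0)
n_t, n_pix, sigma = 200, 5, 0.1
hits = rng.integers(0, n_pix, size=n_t)
hits[:n_pix] = np.arange(n_pix)          # make sure every pixel is observed
phi = np.zeros((n_t, n_pix))
phi[np.arange(n_t), hits] = 1.0          # pointing matrix as the template
q_true = rng.normal(size=n_pix)
d = phi @ q_true + sigma * rng.normal(size=n_t)
w_n = np.eye(n_t) / sigma**2             # white-noise weight

q_hat, w_qf = matched_filter(phi, d, w_n)
c_n = np.linalg.inv(w_qf)                # map noise covariance (phi^T w phi)^-1
```

In this white-noise toy case `w_qf` is diagonal, with entries equal to the hit count per pixel divided by the sample variance; correlated time-stream noise would make it dense.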

If the residual variance $\delta n\,\delta n^\dagger$ is
determined from the data $d$ itself, it would not be
invertible. The estimation of $w_{nf}$ requires an assumption
to regularize the inversion, e.g., the raw variance is
smoothed. The prior on the form of $w_{nf}$ may turn it into an
extra parameter estimation problem. The compression of the
$N_t \times N_t$ information in the matrix
$\delta n\,\delta n^\dagger$ onto the parametric form can
regularize $w_{nf}$.

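The derivation of eq. (4) is most easily seen if both the data
noise $n$ and the signal prior for $q$ are Gaussian:

$$ S_I = \tfrac{1}{2}(n - \bar n)^\dagger w_{nf} (n - \bar n) + \tfrac{1}{2} N_d \ln(2\pi) + \tfrac{1}{2} {\rm Tr} \ln w_{nf}^{-1} $$
$$ \quad + \tfrac{1}{2}(q - \bar q_i)^\dagger w_{qi} (q - \bar q_i) + \tfrac{1}{2} N_q \ln(2\pi) + \tfrac{1}{2} {\rm Tr} \ln w_{qi}^{-1}, $$

with $d = \varphi q + n$. Manipulating this gives the usual:

$$ S_I = S_{Iq} + S_{Id}, $$
$$ S_{Iq} = \tfrac{1}{2} \delta q^\dagger w_{qf}\, \delta q + \tfrac{1}{2} N_q \ln(2\pi) + \tfrac{1}{2} {\rm Tr} \ln w_{qf}^{-1}, $$
$$ S_{Id} = \tfrac{1}{2} d^\dagger C_t^{-1} d + \tfrac{1}{2} N_d \ln(2\pi) + \tfrac{1}{2} {\rm Tr} \ln C_t, $$
$$ C_t = w_{nf}^{-1} + \varphi\, w_{qi}^{-1} \varphi^\dagger, $$

where $d$ in $S_{Id}$ is understood to be measured relative to
its mean $\bar n + \varphi \bar q_i$.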

From $S_{Iq}$, Wiener-filtered linear signals
$\langle q|d, \bar q_i \rangle_f$ (eq. 4) and the fluctuations
about them, $w_{qf}^{-1/2}\eta$, with $\eta$ an $N_q$ vector of
Gaussian random deviates, are obtained. Either this method, or
approximations to it, is the preferred one for E and B
construction. The first such separated polarization component
maps derived from data were presented in the CBI papers (Reeves
et al. 2006), of course with the BB a non-detection consistent
with the noise. There has been much discussion about using
variants of this approach for E-B separation (e.g., Lewis et
al. 2002; Bunn 2002; Bunn et al. 2003; Bunn 2011). From
$S_{Id}$, the statistics of the cosmic (and other) parameters
the $w_{qi}$ in the prior depends upon are derived. This is our
main focus here.

2.1.3. From Pixel Maps to E and B Maps 

The map data vector $\Delta$ is composed of a number of signals
$s$ as well as the map noise $n$. The noise encompasses true
instrumental noise, experimental systematic effects, and
possibly, may draw terms from the signal side that are unwanted
residuals on the sky, e.g., from foreground subtraction
uncertainties. Each signal has a frequency dependence and
polarization components, labelled by the Stokes parameter index
$x$. The map components are generally considered to be linear
in the sky signals,

$$ \Delta_{cxp} = \sum_J s_{Jcxp} + n_{cxp}, \quad x \in \{T, Q, U, V\}, $$
$$ s_{Jcxp} = \sum_{\nu \ell m} \mathcal{F}_{cxp,J\nu x\ell m}\, a_{J\nu x\ell m}, \qquad (6) $$

where the spherical harmonic signal amplitude for signal $J$ is
$a_{J\nu x\ell m}$. The transformation from this natural
multipole space for the signals to the map space is encoded in
the filters $\mathcal{F}_{cxp,J\nu x\ell m}$, which include
beam and pixelization information, the frequency response
function for the channels, and a mask $\mu_{p\ell m}$. The mask
$\mu$ could be a sharp cookie cutter or be more gently tapered.

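The $a_{J\nu x\ell m}$ are the coefficients in the standard
expansion of the CMB temperature and polarization fields in
orthogonal mode functions, which are the spherical harmonics,
spin-0 for $T$ and spin-2 for polarization, with further linear
combinations of the spin-2 expansion coefficients defining the
E and B modes:

$$ T_{J\nu}(\theta,\phi) = \sum_{\ell} \sum_{m=-\ell}^{\ell} a_{J\nu T\ell m}\, Y_{\ell m}(\theta,\phi), $$
$$ (Q \pm iU)_{J\nu}(\theta,\phi) = \sum_{\ell=2}^{\infty} \sum_{m=-\ell}^{\ell} {}_{\pm 2}a_{J\nu\ell m}\; {}_{\pm 2}Y_{\ell m}(\theta,\phi), $$
$$ a_{J\nu E\ell m} = -\frac{1}{2}\left({}_{2}a_{J\nu\ell m} + {}_{-2}a_{J\nu\ell m}\right), $$
$$ a_{J\nu B\ell m} = -\frac{1}{2i}\left({}_{2}a_{J\nu\ell m} - {}_{-2}a_{J\nu\ell m}\right). $$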
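
The linear combinations that define the E and B coefficients are easy to check numerically. The sketch below is an illustration only: the function names `spin2_to_eb` and `eb_to_spin2` are hypothetical, and the inputs are arbitrary complex arrays standing in for the spin-$\pm 2$ coefficients at some set of $(\ell, m)$.

```python
import numpy as np

def spin2_to_eb(a_plus2, a_minus2):
    """Combine spin +2 and spin -2 coefficients into E and B coefficients."""
    a_plus2 = np.asarray(a_plus2, dtype=complex)
    a_minus2 = np.asarray(a_minus2, dtype=complex)
    a_e = -0.5 * (a_plus2 + a_minus2)        # a_E = -(a_2 + a_-2) / 2
    a_b = -(a_plus2 - a_minus2) / 2j         # a_B = -(a_2 - a_-2) / (2i)
    return a_e, a_b

def eb_to_spin2(a_e, a_b):
    """Inverse combination: recover the spin +2 and spin -2 coefficients."""
    a_e = np.asarray(a_e, dtype=complex)
    a_b = np.asarray(a_b, dtype=complex)
    return -(a_e + 1j * a_b), -(a_e - 1j * a_b)

# Round trip on arbitrary coefficients.
rng = np.random.default_rng(1)
a_p = rng.normal(size=8) + 1j * rng.normal(size=8)
a_m = rng.normal(size=8) + 1j * rng.normal(size=8)
a_e, a_b = spin2_to_eb(a_p, a_m)
back_p, back_m = eb_to_spin2(a_e, a_b)
```

A field with equal spin $+2$ and spin $-2$ coefficients has vanishing B coefficients in this convention, which is the statement that it is a pure E pattern on the full sky.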

The separation of the polarization into E and B-modes is useful
because scalar perturbations only result in the E mode whereas
the tensor perturbations generate both (Kamionkowski et al.
1997; Zaldarriaga & Seljak 1997). Nonlinear transport effects
associated with the weak lensing of the primary CMB
fluctuations turn some scalar E mode into scalar B mode, mostly
at higher $\ell$ than the tensor component gives, so separation
for $r$ detection can be done. Note that this lensing source
has non-Gaussian features, which means the power spectra are
not enough to characterize that signal.

For Thomson scattering anisotropies, the $V$ Stokes parameter
associated with circular polarization vanishes, as it also does
for most Galactic foregrounds contaminating the primary CMB
signal, so we now drop it from our consideration. It would of
course be of interest to show experimentally that there is
indeed no circular polarization in the CMB data.

As we have noted above, eqs. (4) and (5) can be applied to the
case in which the $d$ are the maps $\Delta_{cxp}$, the
templates $\varphi$ are the E and B mode function rotators and
the parameters $q$ are the E and B amplitudes in $\ell m$
space. Since these compressed maps $q$ and their variances
contain complete statistical information for a Gaussian model,
the $q$-power can be estimated from
$\langle q|d \rangle \langle q|d \rangle^\dagger + \langle \delta q\, \delta q^\dagger \rangle$.
This is not the optimal determination of power. We adopt the
CBIpol approach (Sievers 2004), that while such optimal E, B
separation is good for checking robustness of results and for
visualization of the polarization signals, the path to
parameters (including bandpowers) is through the quadratic
matrix methods, the mLikely approach of CBIpol, using the
$S_{Id}$ part of the action.

2.1.4. Maps to Parameters with Matrix-based Likelihoods 

For statistically isotropic signals there are generally six
cross-spectra among the coefficients,

$$ \langle a_{x\ell m}\, a^*_{x'\ell' m'} \rangle = C_{X\ell}\, \delta_{\ell\ell'}\, \delta_{mm'}, \quad X = xx', $$

for $x \in \{T, E, B\}$, $X \in \{TT, EE, BB, TE, TB, EB\}$.

Typically the EB and TB power vanish (theoretically anyway) and
only four are needed. However, EB and TB may be kept for
systematics monitoring. For statistically homogeneous and
isotropic 3D Gaussian initial conditions, the primary CMB
$T, Q, U$ are isotropic 2D Gaussian fields whose probability
distribution depends only upon the power spectra $C_{X\ell}$,
or, equivalently, the $X$-power per $\ln(\ell + 1/2)$,

$$ \mathcal{C}_{X\ell} \equiv \frac{\ell(\ell+1)}{2\pi}\, C_{X\ell}. $$

If there is no correlation between signal and noise, the
components of the total covariance matrix $C_{t,cxpc'x'p'}$ are
given by the sum

$$ C_t = C_N + \sum_{JJ'} C_{S,JJ'}, \quad C_{N,cxpc'x'p'} = \langle n_{cxp}\, n_{c'x'p'} \rangle, $$
$$ C_{S,Jcxp,J'c'x'p'} = \langle s_{Jcxp}\, s_{J'c'x'p'} \rangle. $$

The goal of bandpower estimation is to radically compress the
map information onto $\ell$-band power amplitudes, the
$q_{X\beta}$, with templates $\varphi$ of the form
$C_{pp',X\beta}$, the pixel-pixel covariance pattern of band
$\beta$. With sufficiently fine $\ell$-space banding, this
stage of compression can be relatively lossless, allowing the
cosmic parameters to be derived accurately. The shape of these
templates within each band may be crafted to look like
theoretically expected shapes, or could just be flat, which
imposes no prior prejudice. Both approaches have been
effectively used. Usually the $\beta$-shapes have been chosen
to be sharply truncated with no overlap in $\ell$-space, but
this is not at all necessary.

With cut-sky maps, bands are coupled even though they would not
be for full sky observations with statistically homogeneous
noise. The optimal method for estimating power spectra in the
general case is the computationally expensive brute-force
maximum likelihood (MLE) analysis (e.g., Bond et al. 1998),
which iteratively corrects a quadratic expression for
deviations $\delta q_\beta$ of the various bandpowers from
their initial values until the maximum likelihood
$q_{\beta m}$ is reached. The weight matrix $C_t^{-1}(q)$ is
adjusted at each step, until it settles into $C_t^{-1}(q_m)$.
The weight enters in two ways: quadratically in the
likelihood-curvature matrix (approximately the Fisher matrix),
and in the force that drives the relaxation of the parameters
to $q_{\beta m}$. It turns out that one can think of the
quadratic expression as describing the action of a matched
filter on the pixel-pixel pair data product, similar to the way
linear filters acting on the data vector may be matched, as we
show later.
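
The iterated quadratic relaxation can be sketched for a single amplitude $q$ multiplying one covariance template. This is a toy illustration under stated assumptions: the total covariance is taken to be $C_t(q) = C_N + q\,C_{\rm tmpl}$ with real symmetric matrices, the update is the standard Fisher-scoring step (force divided by Fisher curvature), and the names `quadratic_step` and `iterate_quadratic` are hypothetical.

```python
import numpy as np

def quadratic_step(d, c_noise, c_tmpl, q):
    """One quadratic-estimator update for C_t(q) = c_noise + q * c_tmpl."""
    c_tot = c_noise + q * c_tmpl
    z = np.linalg.solve(c_tot, d)             # C_t^-1 d
    a = np.linalg.solve(c_tot, c_tmpl)        # C_t^-1 C_tmpl
    # Force: d lnL / dq = (1/2) [ z^T C_tmpl z - Tr(C_t^-1 C_tmpl) ]
    force = 0.5 * (z @ c_tmpl @ z - np.trace(a))
    # Fisher curvature: (1/2) Tr(C_t^-1 C_tmpl C_t^-1 C_tmpl)
    fisher = 0.5 * np.trace(a @ a)
    return q + force / fisher, fisher

def iterate_quadratic(d, c_noise, c_tmpl, q0=1.0, n_iter=50, tol=1e-10):
    """Repeat the update, re-weighting with C_t(q), until q stops moving."""
    q = q0
    for _ in range(n_iter):
        q_new, fisher = quadratic_step(d, c_noise, c_tmpl, q)
        converged = abs(q_new - q) < tol
        q = q_new
        if converged:
            break
    return q, fisher

# Toy data: diagonal template with unequal weights (hypothetical numbers).
rng = np.random.default_rng(1)
n, q_true = 200, 2.0
c_noise = np.eye(n)
c_tmpl = np.diag(rng.uniform(0.5, 2.0, size=n))
d = np.sqrt(np.diag(c_noise + q_true * c_tmpl)) * rng.normal(size=n)

q_ml, fisher = iterate_quadratic(d, c_noise, c_tmpl)
sigma_q = fisher ** -0.5                      # Fisher estimate of the error
```

When the template is proportional to the identity, a single step already lands on the maximum likelihood value, the mean square of the data minus the noise variance; for general templates the re-weighting with $C_t^{-1}(q)$ at each step is what drives the relaxation.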

Matrix methods for bandpower estimation were used by Boomerang
(de Bernardis et al. 2000; Ruhl et al. 2003) and in all CBI
papers. If the cosmic parameter of interest is linear, like
$r$, then it can be viewed as a single template big-band
bandpower. Even with the fully nonlinear $C_{X\ell}(q)$, the
amplitudes $\delta q$ can be iteratively solved for using
linear derivative templates, and, with convergence, the result
is the same as a full nonlinear treatment gives.
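
Because $r$ enters the covariance linearly, its posterior can also be mapped out directly on a grid. The following is a minimal sketch, assuming a Gaussian likelihood with $C_t(r) = C_{\rm fixed} + r\,C_{\rm tensor}$, a uniform prior on the grid, and random toy matrices in place of real pixel-pixel covariances. The names `neg2_loglike` and `posterior_on_grid` are hypothetical.

```python
import numpy as np

def neg2_loglike(d, c_tot):
    """-2 ln L for zero-mean Gaussian data d with covariance c_tot."""
    sign, logdet = np.linalg.slogdet(c_tot)
    chi2 = d @ np.linalg.solve(c_tot, d)
    return chi2 + logdet + len(d) * np.log(2.0 * np.pi)

def posterior_on_grid(d, c_fixed, c_tensor, r_grid):
    """Normalized posterior for r on a uniform grid, with a flat prior."""
    m2l = np.array([neg2_loglike(d, c_fixed + r * c_tensor) for r in r_grid])
    post = np.exp(-0.5 * (m2l - m2l.min()))   # subtract the minimum for stability
    dr = r_grid[1] - r_grid[0]
    return post / (post.sum() * dr)

# Toy setup (hypothetical): identity for noise plus scalar signal,
# and a random positive-definite matrix as the unit-r tensor template.
rng = np.random.default_rng(2)
n, r_true = 40, 0.12
a = rng.normal(size=(n, n))
c_tensor = a @ a.T / n
c_fixed = np.eye(n)
d = np.linalg.cholesky(c_fixed + r_true * c_tensor) @ rng.normal(size=n)

r_grid = np.linspace(0.0, 1.0, 501)
post = posterior_on_grid(d, c_fixed, c_tensor, r_grid)
dr = r_grid[1] - r_grid[0]
r_mean = (r_grid * post).sum() * dr
r_sigma = np.sqrt(((r_grid - r_mean) ** 2 * post).sum() * dr)
```

Each grid point costs one solve and one log-determinant of the full matrix, which is why the matrix dimension, set by sky fraction and resolution, controls the feasibility of the direct approach.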

2.1.5. Pseudo-$C_\ell$ cf. Map-Matrix Methods

Several fast sub-optimal approximate methods have been
developed to make the bandpower computations less
computationally intense than in the map-matrix method: e.g.,
pseudo-$C_\ell$ estimators (Hansen & Górski 2003; Chon et al.
2004), SPICE (Szapudi et al. 2001), MASTER (Hivon et al. 2002)
and XFaster (Contaldi et al. 2010; Rocha et al. 2011).
Pseudo-$C_\ell$s are constructed by direct spherical harmonic
transform of the cut-sky maps, or more generally,
taper-weighted CMB maps. The all-sky bandpower centred on a
specific $\ell_\beta$, $q_\beta$, is then related by an
appropriate filtering which draws the pseudo-$C_{X\ell}$s from
a wide swath of $\ell$'s determined by a mask-defined coupling
matrix into the desired $\ell$-band. In spite of this
$\ell$-space mixing, extensive testing has shown these methods
to be accurate for temperature anisotropies for large pixel
numbers where the matrix inversions of the iterated quadratic
approach are prohibitively expensive computationally. They have
also been applied effectively to polarization datasets such as
Boomerang (Montroy et al. 2006; Piacentini et al. 2006).

The pseudo-$C_{X\ell}$ for $X = EE, BB$ suffer from E-B mixing
in addition to the $\ell$-space mixing: the estimated
$C_{BB\ell}$ receives contributions from both E and B-modes.
The contamination coming from the E-mode can be removed from
$C_{BB\ell}$ in the mean by having the estimators undergo a
de-biasing step. However, there is still an extra contribution
to the variance of estimators which is due to the dominance of
the relatively large E signal mixed into the B measurement.
This can limit the primordial gravitational wave detection to
$r \gtrsim 0.05$ for deep small sky surveys (covering about 1%
of the sky) as shown by Challinor & Chon (2005). Lewis et al.
(2002) show how to construct window functions that cleanly
separate the E and B modes in harmonic space for azimuthally
symmetric sky observations at the cost of some information loss
due to the boundary of the patch. In another treatment of the
E-B mixing problem, Bunn et al. (2003) show that the
polarization maps can be optimally decomposed into three
orthogonal components: pure E, pure B, and ambiguous modes. The
ambiguous modes receive a non-restorable contribution from both
E and B signals, and are dominated by E signal, thus should be
removed in B-mode analysis. Based on this decomposition, a
near-optimal pure pseudo-$C_\ell$ estimator was proposed (Smith
2006) and developed (Smith & Zaldarriaga 2007; Grain et al.
2009) which ensures no E-B mixing. Recently Bunn (2011) has
given a more efficient recipe for decomposing polarization data
into E, B and ambiguous maps, although still along the lines of
Bunn et al. (2003).

It is clear that if the full map-likelihood analysis can be
done, then it should be done, since relevant information is not
being thrown away. There are two drawbacks to the map-based
approach. The first is that $C_t$ should saturate all
contributions to signal and noise since we are in quest of a
small, essentially perturbative, component associated with $r$
whose values can be biased by the missing components. This
could be challenging in the presence of complex filtering
resulting from time-ordered data processing. Also the
computational cost of the required large matrix manipulations
is high compared to the suboptimal methods. The matrix size
depends upon the fraction of sky covered and the resolution.
For example, for an experiment covering 25% of the sky analyzed
at a Healpix resolution of $N_{\rm side} = 64$, the sizes are
35K $\times$ 35K and we find the likelihood calculation takes
about 5 minutes on a node with 16 Dual-Core Power 6 CPUs at 4.7
GHz (and theoretically capable of doing 600 GFLOPS/node). In
practice, our matrices are smaller than this since the quest
for $r$ requires a relatively low resolution analysis and only
a few other parameters that are correlated with it need to be
carried along, as we show here. To include many more
parameters, standard Bayesian sampling algorithms such as MCMC
and adaptive importance sampling (Wraith et al. 2009) can be
used. If we need to cover small angular scales as well as
large, the matrices become prohibitively large, and hybrid
methods, with a map-based likelihood for large scales joined to
an $\ell$-space-based likelihood for small scales, are needed.

2.2. The Downward Flow of Shannon Entropy from 
Data Compression onto Theory Subspaces 

The Shannon entropy $S_f$ of the final (posterior) probability
distribution is an average of the log of the local phase space
volume $\langle \ln p_f^{-1} \rangle_f$ over the posterior
probability distribution $p_f$, and is considered to provide an
estimate of the total information content in the final ensemble
(see, e.g., MacKay 2003):

$$ S_f(\mathcal{T}|\mathcal{D}) = -\int d^N q\; p_f \ln p_f = \langle \ln P(q|\mathcal{D},\mathcal{T})^{-1} \rangle_f $$
$$ \quad = \langle S_I \rangle_f + \ln P(\mathcal{D}|\mathcal{T}), $$
$$ \langle S_I \rangle_f = \int d^N q\; e^{-S_I} S_I \Big/ \int d^N q\; e^{-S_I}. $$

The initial entropy is averaged over the initial ensemble:
$S_i = \langle \ln P(q|\mathcal{T})^{-1} \rangle_i$. For a
uniform prior over a volume $V_{qi}$ in $q$-space, it is
$S_i = \ln V_{qi}$. The final entropy can be thought of as
having a contribution from (the log of) an effective phase
space volume, reduced relative to the initial one because of
the measurement, plus a term related to the average $\chi^2$
associated with the mean-squared-deviations of $q$, usually
just the number of degrees of freedom unless the model is a
very poor representation of the information content of the
data.
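
The entropy identity $S_f = \langle S_I \rangle_f + \ln P(\mathcal{D}|\mathcal{T})$ can be verified numerically in one dimension. The sketch below is a toy check, assuming a Gaussian-shaped likelihood of width `sigma` centred inside a uniform prior of width `width`; both numbers and the function name `entropy_from_action` are illustrative assumptions. It also converts the entropy change relative to the prior into bits.

```python
import numpy as np

def entropy_from_action(q, action):
    """Posterior entropy on a uniform grid, computed two ways.

    Returns (direct -sum p ln p, identity <S_I> + ln evidence).
    """
    dq = q[1] - q[0]
    shift = action.min()
    w = np.exp(-(action - shift))                 # e^{-S_I}, rescaled for stability
    ln_evidence = np.log(w.sum() * dq) - shift    # ln of the integral of e^{-S_I}
    p = w / (w.sum() * dq)                        # normalized posterior
    s_identity = (p * action).sum() * dq + ln_evidence
    mask = p > 0.0                                # avoid log(0) where p underflows
    s_direct = -(p[mask] * np.log(p[mask])).sum() * dq
    return s_direct, s_identity

# Toy 1D problem: uniform prior on [0, width], Gaussian-shaped likelihood.
sigma, width = 0.01, 1.0
q = np.linspace(0.0, width, 20001)
action = 0.5 * ((q - 0.5) / sigma) ** 2 + np.log(width)   # -ln L - ln prior

s_direct, s_identity = entropy_from_action(q, action)
s_initial = np.log(width)                         # S_i = ln V for a uniform prior
delta_bits = (s_direct - s_initial) / np.log(2.0) # negative: entropy has decreased
```

For a posterior that is Gaussian and well inside the prior range, the result approaches $\frac{1}{2}\ln(2\pi e\,\sigma^2)$, so halving the error on the parameter lowers the entropy by one bit.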

It should not seem curious to say that the information 
entropy decreases as a result of measurements, but it 
may seem curious to word it as: the average information 
content decreases. That is because the fully random ini- 
tial state has more information, in that the variables can 
take on a wider range of values. We think the reduced 
post-experiment information content is of higher quality. 
What constitutes Quality in information is subjective of 
course. 

Consider the initial space of the $\mathcal{D}$, the space of
full time-ordered-information, replete with bolometer readouts,
flagged glitches, housekeeping information, etc. The amount of
information we begin with is therefore enormous. From this data
optimal maps are constructed with map parameters $q$ having
channel and Stokes as well as pixel indices (the $\Delta_{cxp}$
defined above), which define the $\mathcal{T}$-space for this
leg of compression. As the iterations progress towards the
maximum likelihood map, there is a mismatch between the noise
power spectrum $w_n^{-1}$ on the prior iteration, and the noise
variance on the posterior iteration: the latter will be less,
hence so will its logarithm, hence so will the Shannon
information, until it settles into its final converged value,
the information entropy in the maps, $S_f({\rm maps})$. Thus,
$S({\rm maps})$ decreases substantially from the large
available information in the uninformed prior, but also
decreases as the iterations converge, settling on
$S_f({\rm maps}) = N_D/2 + N_D \ln(2\pi)/2 + {\rm Tr}(\ln C_N)/2$.
The new dimension for the reduced data is the number $N_D$ of
generalized pixels: it is the total number of pixels from all
channels times 3 for $T, Q, U$. The $N_D \times N_D$ noise
matrix is $C_N = (\varphi^\dagger w_{nf} \varphi)^{-1}$. The
information per generalized pixel is not so large but there are
lots of such pixels.

The standard noise assumption we make for our maps is that it is
homogeneous and white, usually different for T and Q,U. Given a total
integrated noise power, $N_{\rm pix}\sigma_{\rm pix}^2$, the map entropy is
maximum if the noise is white with the same $\sigma_{\rm pix}$ for each
pixel. We have included modest (yet realistic) inhomogeneity in the noise
as well to test sensitivity to this assumption, but find that it makes
little difference to our results. The noise in a pixel of area
$A_{\rm pix}$ from observations covering an area $4\pi f_{\rm sky}$ over an
observing time $T_{\rm obs}$ is
$\sigma_{\rm pix}^2 \propto 4\pi f_{\rm sky}/(A_{\rm pix}T_{\rm obs})$.
In that case, the entropy is
$S(\mathrm{maps})/\mathrm{pixel} = [1 + \ln(2\pi\sigma_{\rm pix}^2)]/2$.
With a fixed $T_{\rm obs}$ and pixel size, the total entropy difference is
$\propto f_{\rm sky}\ln f_{\rm sky}$ times a large number, thus quite a
bit higher for large regions, and not just because there are more pixels:
it is higher per pixel. The entropy in the map is more constrained if we
focus our available resources on smaller regions, but of course only if
the regions are of a size and resolution to be of relevance for our target
cosmological parameter, e.g., r.
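The scaling of the white-noise map entropy with sky fraction can be sketched numerically. The snippet below is only an illustration under stated assumptions: the proportionality constant `k` and the units of `a_pix` and `t_obs` are arbitrary placeholders (not values from this paper), so only entropy differences between sky fractions are meaningful, not the absolute numbers.

```python
import math

def sigma2_pix(f_sky, a_pix, t_obs, k=1.0):
    """Pixel noise variance, sigma_pix^2 = k * 4*pi*f_sky / (a_pix * t_obs).

    k is a hypothetical proportionality constant (instrument sensitivity).
    """
    return k * 4.0 * math.pi * f_sky / (a_pix * t_obs)

def entropy_per_pixel(sigma2):
    """Entropy per pixel in nats for white Gaussian noise: [1 + ln(2*pi*sigma^2)] / 2."""
    return 0.5 * (1.0 + math.log(2.0 * math.pi * sigma2))

def total_map_entropy(f_sky, a_pix, t_obs, k=1.0):
    """Entropy summed over the N_pix = 4*pi*f_sky / a_pix pixels of the patch."""
    n_pix = 4.0 * math.pi * f_sky / a_pix
    return n_pix * entropy_per_pixel(sigma2_pix(f_sky, a_pix, t_obs, k))

# Same pixel size and observing time, two sky fractions (arbitrary units).
A_PIX, T_OBS = 1.0e-4, 1.0
deep = entropy_per_pixel(sigma2_pix(0.02, A_PIX, T_OBS))
wide = entropy_per_pixel(sigma2_pix(0.2, A_PIX, T_OBS))
print("per-pixel entropy difference (nats):", wide - deep)
```

The per-pixel difference depends only on the ratio of sky fractions, here $\ln(10)/2$ nats, which is the "higher per pixel" statement in the text.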

We can obviously use the maps rather than the TOIs as our starting point
since, by design, no information relevant to estimation of our target $r$
is lost in the compression. The pixel sizes are chosen so this is true.
Most of the huge entropy store in the TOIs is inaccessible to $r$. As we
have discussed, the traditional approach is to further compress
$\mathcal{D}$, but in $\mathcal{D}\otimes\mathcal{D}$ space (actually in
the symmetric $\mathcal{D}\vee\mathcal{D}$ space), by solving for
bandpowers in the manner described above. The translation of the variables
is: $d$ are now the map products $\Delta\Delta^\dagger$, $\psi$ is
$C_{S,pp'}$ and $q$ is the vector of (normalized) bandpowers. Because the
bandpower likelihood surface $p_f(q)$ is quite complex, non-Gaussian and
with band-to-band correlations, determining the information in the
bandpowers requires a direct integration. As well, the $N_{\rm band}$
bands are generalized ones, indexed by channel number, polarization
component (a number for TT, TE, EE, BB, TB, EB), as well as by
$\ell$-band number. With maximum likelihood relaxations to the bandpowers
and Fisher matrix determination of errors (such as is used in XFaster), we
would get
$S(\mathrm{band}) = N_{\rm band}/2 + N_{\rm band}\ln(2\pi)/2 + \mathrm{Tr}(\ln F_{\rm band}^{-1})/2$.

An oft-used approximation to likelihood surfaces fully determines
$P(q^\beta)$ for each band $\beta$ with amplitude $q^\beta$, but treats
band-band correlations in a weakly coupled Gaussian approximation. For
example, Boomerang and CBI and other CMB likelihood analyses used the
offset lognormal approximation of Bond et al. (1998): each $P(q^\beta)$
was fit by a Gaussian in the variable $z^\beta = \ln(q^\beta + q_N^\beta)$,
which required an estimate of the noise in the band, $q_N^\beta$, as well
as the observational mean $\bar{q}^\beta$, with a posterior of form

$$-\ln P(q|\mathcal{D},\mathcal{T}) = \tfrac{1}{2}\,\delta z^\dagger \mathcal{F}_z\,\delta z + \tfrac{1}{2}N_{\rm band}\ln(2\pi) + \tfrac{1}{2}\mathrm{Tr}\ln\mathcal{F}_z^{-1},$$

in terms of the fluctuation $\delta z = z - \bar{z}$ about the
observational z-average $\bar{z}^\beta = \ln(\bar{q}^\beta + q_N^\beta)$.
The transformed correlation matrix is
$\mathcal{F}_z^{-1} = \langle\delta z\,\delta z^\dagger\rangle$. For WMAP,
a correction to this treatment was used, and for Planck a much more
accurate characterization of the likelihood surface is needed, and
continues to be under active development (e.g., Rocha et al. (2010)). For
both, the likelihood is a hybrid, map-based for the low $\ell$'s, and
bandpower-based (with $\Delta\ell_\beta = 1$) for high $\ell$'s.
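As a rough illustration of the offset lognormal form, the sketch below evaluates the negative log-posterior for a single band. It is a minimal one-band toy, not the multi-band implementation used by any experiment: the function name and the numbers are hypothetical, and the width in $z$ is assumed to follow from simple error propagation, $\sigma_z = \sigma_q/(\bar{q} + q_N)$.

```python
import math

def offset_lognormal_neglogp(q, q_bar, q_noise, sigma_q):
    """-ln P(q) for one band, Gaussian in z = ln(q + q_noise), up to a constant.

    q_bar   : observational mean bandpower
    q_noise : noise offset in the band (q_N in the text)
    sigma_q : bandpower error at the peak (assumed input)
    """
    z = math.log(q + q_noise)
    z_bar = math.log(q_bar + q_noise)
    sigma_z = sigma_q / (q_bar + q_noise)  # error propagated to the z variable
    return 0.5 * ((z - z_bar) / sigma_z) ** 2

# Hypothetical band: the curve is skewed, with a longer tail towards high q.
Q_BAR, Q_NOISE, SIGMA_Q = 1000.0, 500.0, 300.0
low = offset_lognormal_neglogp(Q_BAR - 400.0, Q_BAR, Q_NOISE, SIGMA_Q)
high = offset_lognormal_neglogp(Q_BAR + 400.0, Q_BAR, Q_NOISE, SIGMA_Q)
print("-ln P at q_bar - 400:", low, " at q_bar + 400:", high)
```

The asymmetry between the two evaluations is the non-Gaussianity in $q$ that a plain Gaussian fit in the bandpower itself would miss.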

A fully-characterized bandpower likelihood surface can of course be used
for $r$ estimation provided it is lossless. If only a few bands $\beta$
are used, we can use intra-band template shapes with amplitudes
$r_{X\beta}$, which are approximately lossless for $r$; a 2-band
calculation is shown in § 4.7. Mostly we quote single-band results, the
one-step leap to $r$ from the maps, using full-matrix posteriors, as we
explore how different expenditures of observational time for various
experimental sensitivities lead to changes in the error. We primarily
quote $2\sigma_r$ as our error figure of merit, determined as explained in
§ 2.3.

A better figure of merit than $2\sigma_r$ is the change in 1D Shannon
entropy, which tells us the average amount by which the log of the allowed
volume in the $r$ parameter space shrinks in response to varying the
experimental setups. It is 1D because we marginalize over all other
$N - 1$ parameters, the cosmic ones of interest and any nuisance
parameters deemed necessary for the analysis, such as those characterizing
uncertainties in calibration, beams, bolometer T-Q-U leakage, and
foreground uncertainties. The final 1D a posteriori probability
$p_f(r|\mathcal{D},\mathcal{T}_r)\,dr = \langle\delta(r_{\rm op} - r)\rangle_f\,dr = \exp[-\mathcal{S}_{1f}(r) - \ln P(\mathcal{D},\mathcal{T})]\,dr$
involves a 1D information action $\mathcal{S}_{1f}(r)$, the integration
over all parameters except the operator $r_{\rm op}$, whose value is
constrained to be fixed at $r$. Here $\mathcal{D}$ refers to data, e.g.,
the maps, $\mathcal{T}$ refers to the overall theoretical framework, e.g.,
inflation-inspired tilted ΛCDM adiabatic with the usual basic six cosmic
parameters plus $r$, and $\mathcal{T}_r$ refers to $\mathcal{T}$ with the
$r_{\rm op} = r$ constraint.

The 1D Shannon information entropy,
$S_{1f}(r) = \langle\mathcal{S}_{1f}(r)\rangle_f + \ln P(\mathcal{D},\mathcal{T})$,
is best computed by numerical integration over the $r$-grid. The result is
very simple if we truncate the ensemble-averaged expansion of
$\mathcal{S}_{1f}(r)$ at quadratic order:

$$S_{1f}(r) \approx \tfrac{1}{2} + \tfrac{1}{2}\ln(2\pi) + \ln(\sigma_r) = \tfrac{1}{2} + \ln V_r,$$

where $V_r$ (defined by the equation) is the compressed phase space volume
for $r$ after the measurements.

Although we have used the natural log to make the entropy expressions
familiar for physicists, in information theory one often uses the binary
logarithm, lb = $\log_2$. With natural logs the information is in nats,
but with lb it is in bits. When expressing information differences in § 5
we translate to bits. Since a full bit represents a factor of 2
improvement in the error bar, $\Delta S_{1f}(r)$ may only be a fraction of
a bit, trivial perhaps, but subtle too, given the mammoth information
compression from raw data to this one targeted parameter degree of freedom.
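The conversion from error bars to bits follows directly from the quadratic-order entropy above. The sketch below assumes Gaussian 1D posteriors; the function names and the example widths are illustrative only and are not results from this paper.

```python
import math

def entropy_1d_nats(sigma_r):
    """Quadratic-order 1D entropy in nats: 1/2 + ln(2*pi)/2 + ln(sigma_r)."""
    return 0.5 + 0.5 * math.log(2.0 * math.pi) + math.log(sigma_r)

def delta_entropy_bits(sigma_new, sigma_ref):
    """Entropy change in bits when the width goes from sigma_ref to sigma_new."""
    return (entropy_1d_nats(sigma_new) - entropy_1d_nats(sigma_ref)) / math.log(2.0)

# Hypothetical example: halving the error bar on r.
print(delta_entropy_bits(0.05, 0.10))  # one bit gained, i.e. a change of -1 bit
```

Only the ratio of the two widths enters, so the constant terms in the entropy cancel in any comparison between experiments.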

2.3. 2σ Calculation
We define $\sigma_{95}$ through:

$$\int_{\max(0,\,r_b - \sigma_{95})}^{r_b + \sigma_{95}} \mathcal{L}(r)\,dr = 0.954\int_0^\infty \mathcal{L}(r)\,dr \qquad (7)$$

where $r_b$ is the best-fit value of $r$. The $\sigma_{95}$-limit is
determined by numerically integrating the Gaussian-fitted 1D likelihood
curve.

In most cases considered in this paper the likelihood curves turn out to
be well approximated by Gaussians. Therefore, when there is a few-σ
detection (e.g. for $r = 0.12$) or when $r \sim 0$, to a very good
approximation $\sigma_{95} = 2\sigma$, where $\sigma$ is the width of the
Gaussian fit. Thus, throughout this paper we will use the common notation
of $2\sigma$, which represents $\sigma_{95}$ and has been calculated
through eq. (7). The only exception to this way of determining $2\sigma$
is when it is being directly given by the inverse of the Fisher matrix,
where $\sigma$ represents the width of the likelihood function, under the
assumption of its Gaussianity.
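A minimal sketch of the $\sigma_{95}$ definition for a Gaussian-fitted likelihood restricted to $r \ge 0$ is given below. It assumes the integration limits of eq. (7) run from $\max(0, r_b - \sigma_{95})$ to $r_b + \sigma_{95}$ on the left and over all $r \ge 0$ on the right, and it uses the error function plus bisection in place of a grid integration; the function names are hypothetical.

```python
import math

def gauss_integral(a, b, r_b, sigma):
    """Integral from a to b of a unit-normalized Gaussian centred on r_b."""
    s = sigma * math.sqrt(2.0)
    return 0.5 * (math.erf((b - r_b) / s) - math.erf((a - r_b) / s))

def sigma95(r_b, sigma, frac=0.954, tol=1e-12):
    """Solve eq. (7) for sigma_95 by bisection, with the prior r >= 0."""
    total = gauss_integral(0.0, float("inf"), r_b, sigma)

    def excess(x):
        lower = max(0.0, r_b - x)
        return gauss_integral(lower, r_b + x, r_b, sigma) - frac * total

    lo, hi = 0.0, r_b + 10.0 * sigma   # excess(lo) < 0 < excess(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Clear detection and null case, both with a Gaussian width of 0.01.
print(sigma95(0.12, 0.01) / 0.01)
print(sigma95(0.0, 0.01) / 0.01)
```

In both regimes the ratio comes out very close to 2, which is the approximation $\sigma_{95} = 2\sigma$ used in the text.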

3. CONSTRAINED CORRELATIONS AND LINEAR 
RESPONSE IN PIXEL-PAIR AND PARAMETER SPACE 

3.1. Matched Filters in Quadratic Pixel-Pair Space and 
Maximum Likelihood Estimation 

When $C_t(q) = C_N + C_S(q)$ depends in a nonlinear way on $q$, we can
still explore the posterior space by a sequence of linearized steps
$\delta q^\alpha$ which converge to zero in the approach to the maximum
likelihood; $C_{t*} = C_t(q_*)$ evaluated at the prior step $q_*$ can be
thought of as the new general noise matrix and
$[\partial C_S(q)/\partial q^\alpha]\,\delta q^\alpha$ the new signal
matrix in the linear model. The quadratic expression determining the step
is the action of a matched filter on the pixel-pixel pair data
$d = \Delta\Delta^\dagger$, and has a form that can be unravelled from the
general expression eq. (4) with a non-zero residual mean
$\langle n\rangle = C_{t*}$. If instead of the value at the last iteration
we take $q_* = 0$, we get the usual map noise $C_N$, and a generalized
noise with some signal contribution to it if only some of the $q_*$ are
non-zero (e.g., foreground residual parameters).

The signal coefficients $q_{X\beta}$ would be the isotropic power spectra
bandpowers for the sets TT, EE, BB, TE, TB, EB. If the bands are of width
$\Delta\ell = 1$, consisting of a single multipole but all $m$, the $\psi$
are the filters for defining the pixel-pixel correlation matrices in terms
of an $\ell, m$ expansion of the total and polarization fluctuations,
expressed in terms of the filters of § 2.1.3,
$\sim \sum_m F_{cXp,\,vX\ell m}\,F_{cX'p',\,vX'\ell m}$. Or we can choose
just one $\ell$ band with a template shape for each of the 6 $X$ cases,
with 6 amplitudes $r_X$ multiplying these. An example of this approach is
shown in § 4.7. Or we could choose just one set of shapes for all 6, with
only one amplitude multiplier, $q$, which we can normalize to be $r$.
Template consistency is therefore assumed, and this gives the maximum
leverage for teasing out the best determination for $r$ from the data,
although it is of course heavily conditioned by the assumptions that go
into the template construction (namely the values assumed for the other
cosmological parameters which fix the structure of the templates).

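The pixel-pair residual fluctuation weight,
$W = w_{nf} = \langle\delta n\,\delta n^\dagger\rangle^{-1}$, is, for
Gaussian models of $\Delta$, expressible as quadratic combinations of
$w = \langle n\rangle^{-1} = C_{t*}^{-1}$:

$$W_{(ij)(kl)} = [w_{ik}w_{jl} + w_{jk}w_{il} + w_{ij}w_{kl}]/4.$$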
The inverse is 

related so that 

$$W_{(ij)(kl)}\,[W^{-1}]_{(kl)(mn)} = \delta_{(ij)(mn)} = (\delta_{im}\delta_{jn} + \delta_{jm}\delta_{in})/2.$$

(We use the Einstein summation convention, that like indices are to be
summed.) When we reorganize the $\psi^\dagger W$ projector on the right
hand side and the $\psi^\dagger W\psi$ inverse residual matrix on the left
hand side, we obtain the familiar Fisher expression for the parameter
response $\delta q^\alpha$ driven by the pixel-pair deviation
$\delta C_{to} = \Delta\Delta^\dagger - C_{t*}$ of the raw observational
correlation function $C_{to}$ from its current estimate $C_{t*}$:

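$$F_{\alpha\beta}\,\delta q^\beta = \tfrac{1}{2}\mathrm{Tr}\left[C_{t*}^{-1}\,\frac{\partial C_t}{\partial q^\alpha}\,C_{t*}^{-1}(\Delta\Delta^\dagger - C_{t*})\right] = [\psi^\dagger W(\Delta\Delta^\dagger - C_{t*})]_\alpha,$$

$$F_{\alpha\beta} = \tfrac{1}{2}\mathrm{Tr}\left[C_{t*}^{-1}\,\frac{\partial C_t}{\partial q^\alpha}\,C_{t*}^{-1}\,\frac{\partial C_t}{\partial q^\beta}\right] = [\psi^\dagger W\psi]_{\alpha\beta}. \qquad (8)$$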

This expression shows that the $\delta q^\alpha$-adjustment is through a
matched filter based on the templates $\psi_{X\alpha,(ij)}$ of form
$[\partial C_t/\partial q^{X\alpha}]_{(ij)}$. The weighting in pixel-pair
space shown is essential for it to be optimal.
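To make the iteration concrete, here is a toy version of the Fisher step for a single amplitude $r$. It is a sketch under strong simplifying assumptions that are not the pixel-space method of this paper: it works in $\ell$-space with no mode mixing, so the traces reduce to sums over $\ell$ weighted by $f_{\rm sky}(2\ell+1)/2$; the template shape, noise level and the 10% ripple standing in for the scatter of a single realization are all hypothetical.

```python
import math

ELLS = range(2, 201)
F_SKY = 0.1
R_FID = 0.12
NOISE = 1.0e-3  # toy white generalized-noise level (arbitrary units)

def template(ell):
    """Toy r = 1 tensor BB template (hypothetical shape, not a CAMB output)."""
    return 1.0e-2 * math.exp(-((ell - 80.0) / 40.0) ** 2)

def c_total(ell, r):
    """Model spectrum C_t = C_N + r * template."""
    return NOISE + r * template(ell)

def c_observed(ell):
    """Stand-in for the measured spectrum: fiducial model with a +/-10% ripple."""
    return c_total(ell, R_FID) * (1.0 + 0.1 * (-1) ** ell)

def fisher_step(r):
    """Return (delta_r, F) from the l-space analogue of eq. (8) at the current r."""
    p = 0.0
    fisher = 0.0
    for ell in ELLS:
        weight = 0.5 * F_SKY * (2 * ell + 1) / c_total(ell, r) ** 2
        p += weight * template(ell) * (c_observed(ell) - c_total(ell, r))
        fisher += weight * template(ell) ** 2
    return p / fisher, fisher

def relax(r_start=0.0, n_iter=30):
    """Iterate r -> r + delta_r until the step becomes negligible."""
    r = r_start
    step = float("inf")
    fisher = 0.0
    for _ in range(n_iter):
        step, fisher = fisher_step(r)
        r += step
    return r, step, fisher

r_ml, last_step, fisher = relax()
print("r_ml =", r_ml, " 1-sigma (Fisher) =", 1.0 / math.sqrt(fisher))
```

Because the weights depend on the current estimate, the step has to be re-evaluated at each iteration; in this toy case the correction after the first step is already small.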

In § 3.2 and § 3.3 we replace $\delta C_{to}$ by other pixel-pair
deviations to show how the single tensor template-based bandpower, namely
$r$, responds to individual E and B multipoles, i.e., a window function
showing where the $\ell$ power that $r$ is sensitive to lies. With the
noise and sky fraction embedded in the weights and in $\psi$, these window
functions vary from experimental setup to experimental setup.

Equation (8) is an exact one following from a $\chi^2$ minimization of the
linear expansion of $C_t$ in $\delta q$, albeit to be iteratively
corrected. The more data-related path is to expand the information action
associated with the posterior $p_f(q + \delta q)$ about the starting point
$q_*$ to second order in $\delta q^\alpha$ (e.g., Bond (1996); Bond et al.
(1998)):

$$\mathcal{S}_f(q_* + \delta q) = \mathcal{S}_f(q_*) - p_\alpha\,\delta q^\alpha + \tfrac{1}{2}\mathcal{F}_{\alpha\beta}\,\delta q^\alpha\delta q^\beta,$$

$$\mathcal{F}_{\alpha\beta}(q_*) - F_{\alpha\beta}(q_*) = -\tfrac{1}{2}\mathrm{Tr}\,C_{t*}^{-1}\frac{\partial^2 C_t}{\partial q^\alpha\partial q^\beta}C_{t*}^{-1}\delta C_{to} + \mathrm{Tr}\,C_{t*}^{-1}\frac{\partial C_t}{\partial q^\alpha}C_{t*}^{-1}\frac{\partial C_t}{\partial q^\beta}C_{t*}^{-1}\delta C_{to}.$$

Fig. 1. The filter $W_{X\ell,X'\ell'}$, $X, X' \in \{EE, BB\}$, shows how
the mode $C_{X\ell}$ linearly responds to a small change in the mode
$C_{X'\ell'}$. The leakage response shown here is for an $\ell' = 100$
stimulus, for a Spider-like experiment with $f_{\rm sky} = 0.07$ (at
$N_{\rm side} = 64$) and $f_{\rm sky} = 0.007$ (at $N_{\rm side} = 128$).
Note the different y-axis scales.



The fluctuation $\mathcal{F} - F$ of the curvature metric
$\mathcal{F}_{\alpha\beta}$ from the Fisher matrix $F_{\alpha\beta}$ of
eq. (8) has the two terms shown. Both are associated with the residual
$\delta C_{to}$ mismatch, since the parameter space correlations may not
be able to fully saturate the data correlations. If the theory (including
noise) is a good approximation to those components of $\delta C_{to}$
which survive the heavy matched-filtering, then these terms disappear with
ensemble-averaging over all realizations. A caution is of course that we
only inhabit a single realization. (The first subdominant second order
term depends upon $\partial^2 C_t/\partial q^\alpha\partial q^\beta$,
hence vanishes in a linear expansion model.) Each
$\delta q^\alpha = [\mathcal{F}^{-1}]^{\alpha\beta}p_\beta$ drives the
system towards the $p_\alpha(q_* + \delta q) = 0$ "equilibrium", but
corrective steps are needed to fully relax to the maximum likelihood. In
practice, using $F_{\alpha\beta}$ in place of $\mathcal{F}_{\alpha\beta}$
is usually adequate, and indeed often preferred. In cases with
structure-less likelihood functions, a few iterations usually suffice to
take us as close to the peak as required.

Since the entire statistics, given the validity of the Gaussian
approximation for both signal and noise, is fully specified by the
likelihood expression together with the prior probability defining the
measure on the parameter space, no issue explicitly arises about mixing
the EB-modes. The optimal quadratic filter to obtain the maximum
likelihood for $r$ takes into account all aspects of the polarization. We
can operate in the QU polarization space, with specific spatial axes
chosen for the polarization basis vectors, or we can do a transformation
to spherical harmonic space and choose a polarization basis which is
explicitly $\ell m$ dependent, as in the EB basis case.

Fig. 2. Beam and pixel window functions for different resolutions are
compared to the polarization power spectra for the best fit WMAP7-only
parameters for the ΛCDM + lensing + SZ + tensor model, with the addition
of a tensor component of strength $r_{\rm fid} = 0.12$. B-mode (GW) shows
just the gravitational wave induced contribution and B-mode (GW+lens)
includes the lensing contribution as well.

3.2. Linear Response of $C_{BB\ell}$ to $C_{EE\ell}$: Power Leakage

In this section, we use quadratic matched-filters to quantify the leakage
of CMB power among the different $C_{X\ell}$ spectra. These are
"susceptibilities", relating the linear response of a target variable to
the stimulus of a driver variable. We also refer to them as window
functions to be consistent with the language used for bandpowers, in which
the driver is the $C_{S,X\ell}$, and the response is the bandpower. The
window function attached to each bandpower "gathers in $\ell$-space" from
a given $C_{S,X\ell}$ the bandpower. There is a long history of making
such windows publicly available. They were used in likelihood evaluations
in the 2000 release of the Boomerang "B98" results (Lange et al. 2001).
Tegmark & de Oliveira-Costa (2001) used similar window functions in the
quest for a best quadratic estimator.

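If we treat $C_{S,X\ell}$ as our variable and replace
$C_t^{-1}\delta C_{to}$ by its ensemble average, we have

$$\langle\delta C_{to}\rangle = \delta C_t = \sum_{X,\ell}\frac{\partial C_t}{\partial C_{X\ell}}\,\delta C_{X\ell}, \qquad (9)$$

hence

$$\delta q^\alpha = \sum_{X,\ell} W^\alpha_{X\ell}\,\frac{\delta C_{X\ell}}{C_{X\ell}}, \qquad W^\alpha_{X\ell} \equiv \sum_\beta [F^{-1}]^{\alpha\beta}F_{\beta,X\ell}, \qquad F_{\beta,X\ell} \equiv \left\langle\frac{\partial^2\mathcal{S}_f}{\partial q^\beta\,\partial\ln C_{X\ell}}\right\rangle.$$

It is isotropized over $m$. Another variant of $W$ can tell us how
uncertainty in $q^\alpha$ is distributed over $\ell$-space.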

With the $C_{X\ell}$ as the response parameters $q^\alpha$ as well as the
stimulating drivers, we have

$$\frac{\langle\delta C_{X\ell}|\delta C_{X'\ell'}\rangle}{C_{X\ell}} = \sum_{X',\ell'} W_{X\ell,X'\ell'}\,\frac{\delta C_{X'\ell'}}{C_{X'\ell'}}.$$

We have verified numerically that a full sky observation using the matrix
methods gives uncorrelated modes,
$W_{X\ell,X'\ell'} = \delta_{\ell\ell'}\delta_{XX'}$. Figure 1 shows the
increase in mode correlation with decreasing $f_{\rm sky}$ for a fixed
observation time. The observed patches are in the form of spherical caps.
We plot an $\ell' = 100$ stimulus for $f_{\rm sky} = 0.07$ (at
$N_{\rm side} = 64$, pixel size ≈ 56') and $f_{\rm sky} = 0.007$ (at
$N_{\rm side} = 128$, pixel size ≈ 28'). (Figure 2 shows the associated
beam and pixel window functions along with the polarization power spectra.)

Although the EE and BB responses are localized around the input
$\ell' = 100$, they are spread over $\ell$ and leak into the other mode.
By contrast, the cross-filters ($W_{BE,100\ell}$ and $W_{EB,100\ell}$) are
not localized. Note that they are substantially smaller than
$W_{BB,100\ell}$ and $W_{EE,100\ell}$. We also see that the relative
contribution of the EE signal to the contamination of BB is about 3 orders
of magnitude larger than the contamination in EE due to BB. We can
conclude that EE power uncertainties from a large range of scales will
affect the BB measurement. The width of the oscillation,
$\Delta\ell \sim \theta_{\rm patch}^{-1}$, is related to the cap size,
narrowing as $f_{\rm sky}$ goes up. The leakage is larger for smaller $r$,
hence must be well characterized for highly sensitive B-mode experiments
to avoid a false detection.

3.3. Linear Response of r to $C_{BB\ell}$ and $C_{EE\ell}$

We now use these quadratic matched-filters to quantify the linear response
of $r$ (and other cosmological parameters) to uncertainty in the
$C_{X\ell}$. The filter for a Spider-like experiment with a fiducial
$r = 0.12$ is shown in Figure 3 as $f_{\rm sky}$ varies (as does the pixel
size). The red, purple, blue and green curves correspond to
$f_{\rm sky} = 0.75$, 0.25, 0.07 and 0.007, calculated at
$N_{\rm side} = 16$, $N_{\rm side} = 32$, $N_{\rm side} = 64$ and
$N_{\rm side} = 128$ respectively. As expected, the figures show that the
measured $r$ is more sensitive to BB than to EE on most scales.

4. SIMULATION METHODS AND CALCULATIONAL 
RESULTS 

In this section, we use the map-based TQU likelihood procedure of § 2 to
compute the posterior $P(q|f_{\rm sky},\mathcal{D},\mathcal{T})$ in
parameter subspaces and, by marginalization, the 1D posterior
$P(r|f_{\rm sky},\mathcal{D},\mathcal{T})$ as a function of $f_{\rm sky}$.
Although we avoid explicit E-B decomposition with this method, we do make
identical calculations to the TQU matrix ones in $\ell$-space using TT,
TE, EE and BB, and assuming no mixing. We show that such a naive approach
does quite well in predicting the errors: if properly handled,
polarization-mode mixing is not a significant error source in most cases.
Of course for either method to be successful, all generalized noise
sources need to be identified, including instrumental leakage from T to
Q,U.

Fig. 3. Window functions for $X \in \{EE, BB\}$ for different sky cuts
show that, as expected, all-sky experiments are nicely sensitive to the
reionization BB bump, but smaller sky experiments are not, although they
pick up well the $\ell \sim 50$-$100$ region. We have used $r = 0.12$ for
the fiducial model. The rapid declines to high $\ell$ are more due to the
onset of experimental noise than to the onset of the lensing-induced B
"noise". Residual foreground noise has not been included in these plots.
Note that even a coverage with $f_{\rm sky}$ only 0.007 can punch out a
robust detection from 50 to 150 in $\ell$, and though 0.07 loses out a bit
(relatively) at 150, its detection would come from a wider stretch in
$\ln\ell$, out to $\ell \sim 20$ before falling off. Only at
$f_{\rm sky} > 0.25$ does one begin to pick up the reionization bump. The
curious drop in the all-sky $N_{\rm side} = 16$ red line at the top is due
to the Spider-like noise for higher $\ell$ being heavily enhanced because
all of the sky is covered in the same amount of observing time. To
illustrate the role of this, a CMBPol-like experiment with $C_N$ decreased
by ~1000 is plotted, with $N_{\rm side} = 16$ (dashed straight line) and
$N_{\rm side} = 64$ (triple-dot-dashed line). The reason all three are
offset from one another is that the normalizing $\sigma_r$ depends upon
the amount the filter captures of the total $r$ signal.

4.1. Calculation of Ensemble-Averaged Posteriors on 
Parameter Grids 

We calculate the posterior distribution on a gridded parameter space, a
method mostly applicable to low dimensional parameter spaces. At each
point of the parameter grid the $C_{X\ell}$'s are calculated using the
public code CAMB (http://camb.info/). These are then multiplied by beam
windows, $B_\ell^2 = e^{-\ell(\ell+1)\sigma_b^2}$, assuming a Gaussian
beam of width $\sigma_b = 0.425\,\theta_{\rm FWHM}$, and pixelization
windows, an isotropized approximation to finite pixel size effects.
(Timestream digitization filters are also generally required, but are
swamped by these two filters.) The product is used to construct the
symmetric 3×3 theoretical pixel-pixel signal covariance matrices, with 6
independent sub-matrices, $C_{S,X}$, $X \in \{TT, TQ, TU, QQ, QU, UU\}$.
We assume experimental noise is Gaussian and usually take it to be white,
so $C_{N,T} = \sigma_{n,T}^2 I$ for the temperature block and
$C_{N,Q,U} = \sigma_{n,\rm pol}^2 I$ for the polarization block of the
covariance matrix, where we usually have
$\sigma_{n,\rm pol} \approx \sqrt{2}\,\sigma_{n,T}$. Here the $\sigma_n$
are effective noises per pixel, an amalgamation of the noises coming from
different frequency channels. $I$ is the identity matrix. We neglect
leakage from T to Q,U.
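The two ingredients that multiply or add to the theory spectra can be sketched as follows. This is an illustrative helper only: it assumes the beam window acts on the power spectrum as $B_\ell^2 = e^{-\ell(\ell+1)\sigma_b^2}$ with $\sigma_b = 0.425\,\theta_{\rm FWHM}$, and that the polarization noise is exactly $\sqrt{2}$ times the temperature noise; the function names are hypothetical, and the input numbers are simply reused from Table 1 as examples.

```python
import math

ARCMIN = math.pi / (180.0 * 60.0)  # one arcminute in radians

def beam_window_sq(ell, fwhm_arcmin):
    """Gaussian beam window on the power spectrum, B_l^2 = exp(-l(l+1) sigma_b^2)."""
    sigma_b = 0.425 * fwhm_arcmin * ARCMIN
    return math.exp(-ell * (ell + 1) * sigma_b ** 2)

def white_noise_variances(sigma_t):
    """Diagonal entries of the T and Q,U noise blocks for white noise.

    Assumes sigma_pol = sqrt(2) * sigma_T, as usually adopted in the text.
    """
    sigma_pol = math.sqrt(2.0) * sigma_t
    return sigma_t ** 2, sigma_pol ** 2

# Example: a 50 arcmin beam suppresses power noticeably by l ~ 100.
for ell in (10, 50, 100, 200):
    print(ell, round(beam_window_sq(ell, 50.0), 4))

var_t, var_pol = white_noise_variances(3.2)
print("noise variances (T, pol):", var_t, var_pol)
```

The constant 0.425 is the rounded value of $1/\sqrt{8\ln 2}$, the usual conversion from FWHM to Gaussian width.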

Since we are forecasting the uncertainties in r from 
different experimental setups, and not analyzing actual 
CMB maps, we can bypass creating a large ensemble of 
simulated CMB maps by replacing the observed correla- 
tion matrix Cto by its ensemble average: 

$$\mathrm{Tr}\,C_t^{-1}\Delta\Delta^\dagger \to \mathrm{Tr}\,C_t^{-1}\langle\Delta\Delta^\dagger\rangle = \mathrm{Tr}\,C_t^{-1}C_{to}.$$

Here Cto is the ensemble-averaged "pixel-pair data", 
namely the covariance matrix of the input fiducial sig- 
nal model together with the instrument noise, and Ct(q) 
is the signal pixel-pixel covariance matrix for the param- 
eters q plus the various noise contributions, instrumental 
and otherwise. An advantage of this approach is that the 
recovered values of the parameters are what the ensem- 
ble average of sky realizations would yield, and will not 
move hugely due to the chance strangeness of any one 
realization (as the real sky may provide for us). Note 
that while sample variance does not impact the location 
of the maximum likelihood in this ensemble-averaged ap- 
proach, it is fully reflected in the width of the posterior 
distribution from which our uncertainties are derived. 
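The ensemble-averaging shortcut can be illustrated with a toy grid calculation. The sketch below is not the pixel-space TQU computation of this paper: it assumes an $\ell$-space likelihood with no mode mixing, a single BB-like spectrum, an $f_{\rm sky}$ mode-count scaling, and a hypothetical template and noise level. The "data" spectrum is simply the model evaluated at the fiducial $r$.

```python
import math

ELLS = range(2, 201)
F_SKY = 0.1
R_FID = 0.12
NOISE = 1.0e-3   # toy generalized-noise level, arbitrary units
DR = 0.002       # grid spacing in r

def template(ell):
    """Toy r = 1 tensor template (hypothetical shape)."""
    return 1.0e-2 * math.exp(-((ell - 80.0) / 40.0) ** 2)

def c_total(ell, r):
    return NOISE + r * template(ell)

def log_like(r):
    """Ensemble-averaged ln L(r): the data spectrum is replaced by C(r_fid)."""
    total = 0.0
    for ell in ELLS:
        c_model = c_total(ell, r)
        c_data = c_total(ell, R_FID)
        total -= 0.5 * F_SKY * (2 * ell + 1) * (c_data / c_model + math.log(c_model))
    return total

def posterior_on_grid(n_points=151):
    """Normalized 1D posterior on a uniform grid with a flat prior on r >= 0."""
    grid = [i * DR for i in range(n_points)]
    logs = [log_like(r) for r in grid]
    peak = max(logs)
    weights = [math.exp(v - peak) for v in logs]  # subtract peak to avoid underflow issues
    norm = sum(weights) * DR
    return grid, [w / norm for w in weights]

grid, post = posterior_on_grid()
r_peak = grid[post.index(max(post))]
mean = sum(r * p for r, p in zip(grid, post)) * DR
sigma = math.sqrt(sum((r - mean) ** 2 * p for r, p in zip(grid, post)) * DR)
print("peak at r =", r_peak, " posterior width =", sigma)
```

The peak lands on the fiducial value by construction, while the width still carries the sample variance through the $(2\ell+1)$ mode counting, which mirrors the remark above about the location and the width of the posterior.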

We mask out the part of the Galaxy falling in the observed patch (the P06
WMAP-mask, Page et al. (2007)), assuming it to be too foreground-dominated
for useful parameter extraction. We also project out modes larger than the
fundamental mode of the observed patch since, due to time-domain
filtering, information is not usually recoverable on such large scales.
For instance, if the mask has the shape of a spherical cap extending from
the north pole to $\theta = \theta_{\rm patch}$, we add a very large noise
to the modes with $2\ell + 1 < [2\pi/\varpi]$, where $[..]$ takes the
integer part and $\varpi = 2\sin(\theta_{\rm patch}/2)$ is the flat 2D
radius of the disk with an area equal to the solid angle of the cap. This
makes the likelihood insensitive to any information at and beyond the
patch scale. This large scale mode cut is especially important to include
for larger values of $f_{\rm sky}$, where the low $\ell$ modes contribute
significantly to the $r$ measurement through the reionization bump. In
real large sky experiments it will not be easy to draw such modes from the
maps.
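The mode cut for a spherical cap can be written down in a few lines. The sketch below assumes the cap angle is related to the sky fraction by $\theta_{\rm patch} = \cos^{-1}(1 - 2f_{\rm sky})$ and applies the criterion $2\ell + 1 < [2\pi/\varpi]$ literally; the function names are hypothetical, and the search starts at $\ell = 2$ by assumption.

```python
import math

def cap_angle(f_sky):
    """Opening angle of a spherical cap covering a fraction f_sky of the sky."""
    return math.acos(1.0 - 2.0 * f_sky)

def flat_radius(theta_patch):
    """Radius of the flat disk with the same area as the cap, 2 sin(theta/2)."""
    return 2.0 * math.sin(0.5 * theta_patch)

def is_cut(ell, f_sky):
    """True if the multipole is assigned very large noise by the patch-scale cut."""
    varpi = flat_radius(cap_angle(f_sky))
    return 2 * ell + 1 < math.floor(2.0 * math.pi / varpi)

def lowest_kept_ell(f_sky, ell_start=2):
    """First multipole (searching upward from ell_start) that survives the cut."""
    ell = ell_start
    while is_cut(ell, f_sky):
        ell += 1
    return ell

for f_sky in (0.007, 0.07, 0.25, 0.75):
    theta_deg = math.degrees(cap_angle(f_sky))
    print(f_sky, round(theta_deg, 1), lowest_kept_ell(f_sky))
```

Since $\sin^2(\theta_{\rm patch}/2) = f_{\rm sky}$ for a cap, the flat radius is simply $2\sqrt{f_{\rm sky}}$, so the cut scales as $f_{\rm sky}^{-1/2}$.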

Our simulations cover two observational cases: an all-sky experiment with
Planck-like white noise levels, and a partial sky experiment with
Spider-like white noise levels, each with two frequency channels, assuming
other frequencies are used for subtracting foregrounds. We have also made
the simplifying assumption that in each experiment, the FWHM of both
channels is the same as the channel with the larger beam. This does not
affect the results much due to the crude size of the pixelization and the
absence of a gravitational wave signal at small scales. See Table 1 for
other experimental assumptions.

TABLE 1
Specifications of Spider-like, Planck-like and CMBPol (mid-cost)
experiments for simulations.

Experiment             Freq (GHz)   FWHM   num. of det.   ΔT^a (I)   ΔT^a (Q & U)   obs. time
Spider-like^b              96       50'        768          3.2         4.5          580 hr
Spider-like               150       32'        960          2.7         3.8          580 hr
Planck-like^c             100       10'          8          3.8         6.1          2.5 yr
Planck-like               143        7'          8          2.4         4.6          2.5 yr
CMBPol (mid-cost)^d       100        8'                     0.18        0.26
CMBPol (mid-cost)         150        5'                     0.19        0.27

a nK, the instrument sensitivity divided by √(total observation time).

b These Spider-like specifications, which are used as the default in this
paper, are different from the more recently proposed ones in Fraisse et
al. (2011) with two 20 day flights. The first flight uses three 90 and
three 150 GHz receivers each with 288 and 512 detectors respectively. In
the second flight, two 280 GHz receivers replace one 90 and one 150 GHz
telescope, leaving the configuration of the flight identical to the first
one. The detector sensitivity as proposed in Fraisse et al. (2011) is 150,
150 and 380 μK_CMB √s at 90, 150 and 280 GHz, respectively. The
performance of the default Spider-like experiment in this paper and the
more recent proposal as in Fraisse et al. (2011) are very close (see
Figure 12).

c http://www.rssd.esa.int/index.php?project=planck

d For a mid-cost full-sky CMBPol experiment based on table 13 of Baumann
et al. (2009). We are using 100 and 150 GHz channels in our simulations.
Adding more channels, in the unrealistic case of no foreground
contamination we simulate, would not affect the limits on r, since with
these low instrument noise levels, either lensing or cosmic variance,
depending on how small r is, would be the dominant source of uncertainty.

For the Spider-like case we keep the flight time constant so that the
observation gets deeper as $f_{\rm sky}$ decreases, while for the
Planck-like experiment the pixel noise is assumed constant for different
values of $f_{\rm sky}$. The latter case, with small values of
$f_{\rm sky}$, is used to illustrate how well a strategy of only analyzing
the lowest foreground sky could work, if for example, foreground removal
turns out to be prohibitive over much of the sky. If foregrounds can be
well-removed from Planck, then full sky is appropriate.

We calculate the constraints on targeted cosmological parameters for
different $f_{\rm sky}$'s, assuming the observed patches are spherical
caps from $\theta = 0$ to $\theta = \theta_{\rm patch}$, corresponding to
$\theta_{\rm patch} = \cos^{-1}(1 - 2f_{\rm sky})$. We perform the
analysis at different resolutions for different sky cuts to minimize the
effect of pixelization for small $f_{\rm sky}$ on the one hand, and to
keep the computational time reasonable for large $f_{\rm sky}$ on the
other hand. We use $N_{\rm side} = 32$, $N_{\rm side} = 64$ and
$N_{\rm side} = 128$ for $f_{\rm sky} > 0.25$,
$0.007 < f_{\rm sky} < 0.25$, and $f_{\rm sky} < 0.007$, respectively. We
checked the results for two neighbouring resolutions at the resolution
switches. For the low $f_{\rm sky}$ switch, results are not sensitive to
the change of resolution, while for the switch at larger $f_{\rm sky}$ we
are about 10%-15% pessimistic in the results by choosing the lower
resolution, specifically for a Planck-like case (with small beam) and for
a higher value of $r$, e.g., $r = 0.12$. In these cases, lensing starts to
dominate at higher multipoles and choosing a high enough resolution for
the analysis would improve the errors on $r$ by resolving the primordial
gravity waves at relatively high multipoles.
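The resolution rule and the resulting matrix sizes can be sketched as below. This is an illustration only: the treatment of the exact boundary values (0.25 and 0.007) is an assumption, since the text gives open intervals, and the pixel size is taken as the square root of the pixel area for $12N_{\rm side}^2$ equal-area pixels; the function names are hypothetical.

```python
import math

def choose_nside(f_sky):
    """Resolution rule from the text; boundary handling is an assumption."""
    if f_sky >= 0.25:
        return 32
    if f_sky > 0.007:
        return 64
    return 128

def pixel_size_arcmin(nside):
    """Side of a square with the area of one of the 12 * nside^2 equal-area pixels."""
    n_pix = 12 * nside ** 2
    return math.degrees(math.sqrt(4.0 * math.pi / n_pix)) * 60.0

def n_generalized_pixels(f_sky, nside):
    """Approximate T, Q, U covariance dimension: 3 x (pixels inside the patch)."""
    return 3 * round(f_sky * 12 * nside ** 2)

for f_sky in (0.75, 0.07, 0.007):
    nside = choose_nside(f_sky)
    print(f_sky, nside, round(pixel_size_arcmin(nside), 1),
          n_generalized_pixels(f_sky, nside))
```

The last column gives a feel for the size of the dense matrices that have to be inverted at each grid point, which is what drives the choice of a coarser resolution for the larger sky fractions.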



4.2. Residual Foreground-Subtraction "Noise"

No study of gravitational wave detectability by B-mode experiments can
ignore the impact of polarized foreground emission. Component separation
is a major industry in itself. Various techniques have been utilized with
CMB data up to now, often involving template parameter marginalization of
one sort or another. We have been lucky so far in that the foregrounds
have been manageable for TT, TE and EE. The level of subtraction needed to
unearth the very tiny gravity wave induced B-signal is rather daunting,
especially since the foregrounds are largest at the low $\ell$. Thus,
although we may wrestle the generalized noise from the detectors and from
experimental systematics to levels allowing small $r$ to be detectable,
the foregrounds will need to be well addressed before any claim of
primordial detection will be believable. Although we have learned much
already about the TT foregrounds and, from WMAP, the synchrotron EE, we do
not know the $\ell$-shape or the amplitude of the polarization for dust.
In O'Dea et al. (2011b,a), the polarization emission from thermal dust is
based on a three-dimensional model of dust density and two-component
Galactic magnetic field. It is assumed that the degree of polarization has
a quadratic dependence on the magnetic field strength and its direction is
perpendicular to the component of the local magnetic field in the plane of
the sky, similar to the model assumed by WMAP in Page et al. (2007). In
forecasting for proposed post-Planck satellite experiments, simple
approximations for thermal dust and synchrotron emission have been made
(e.g., Baumann et al. (2009), and references therein). The dusty
$\ell$-structure in this model is similar to the O'Dea et al. (2011b)
form: $C_{X\ell} \sim \ell^{-0.5}$ for X = EE, BB. We follow this Baumann
et al. (2009) approach here, but apply it to our pixel-based analysis.

We therefore assume that the maps are already foreground-subtracted,
possibly with the wider Planck frequency coverage used in conjunction with
the Spider maps, with the CMB-component having a residual uncertainty,
which we incorporate in our analysis as an additional large-scale
(inhomogeneous) noise component. We assume the power spectrum of the
foreground residuals has the same shape as the original foreground
spectrum, but with only a few percent of the amplitude:

$$C_{X\ell} \to C_{X\ell} + \sum_{fg=S,D}\epsilon^{(fg)}\,C^{(fg)}_{X\ell}, \qquad X = EE, BB,$$

with the sum over synchrotron S and dust D emissions. The tunable
removal-efficiency parameters $\epsilon^{(fg)}$ are taken to be 5% in our
plots. The shapes are:

$$\text{synchrotron:}\quad \frac{\ell(\ell+1)}{2\pi}\,C^{S}_{X\ell}(\nu) = A_S\left(\frac{\nu}{\nu_0}\right)^{2\alpha_S}\left(\frac{\ell}{\ell_0}\right)^{\beta_S},$$

$$\text{dust:}\quad \frac{\ell(\ell+1)}{2\pi}\,C^{D}_{X\ell}(\nu) = p^2 A_D\left(\frac{\nu}{\nu_0}\right)^{2\alpha_D}\left(\frac{\ell}{\ell_0}\right)^{\beta_D}\left[\frac{e^{h\nu_0/kT} - 1}{e^{h\nu/kT} - 1}\right]^2.$$

TABLE 2
Parameters of our assumed foreground model, adopted from Baumann et al.
(2009).

Parameters   Synchrotron    Dust
A            4.7 × 10^-5    1
ν_0          30             94
ℓ_0          350            10
α            -3             2.2
β^E          -2.6           -2.5
β^B          -2.6           -2.5

The dust polarization fraction, $p$, is assumed to be around 5%. The
values for the other parameters taken from Baumann et al. (2009) are
listed in Table 2. They were chosen to give agreement with WMAP, DASI and
IRAS observations (and the Planck sky model, which is based on these).
Although this model provides only a rough guide to the impact incomplete
foreground subtraction will have on r-estimation, it does include the
crucial large-scale dependence which differentiates it so much from the
structure of the instrumental noise.

A natural question when considering deep small sky observations is how
many patches there are on the sky with low foregrounds so the requisite
cleaning is at a minimum. The Planck Sky Model for the polarized
foreground emission (Leach et al. 2008; Delabrouille et al. 2011) is
similar to the one we have adopted. Using a code developed by
Miville-Deschênes, we have calculated for patches of radius $R$ the
pixel-averaged variance at pixel $p$,
$\sigma^2_{\rm pol,fg}(p, R) = \langle(P - \bar{P}(<R))^2\rangle$, of the
polarization intensity $P = \sqrt{Q^2 + U^2}$ about the patch-average
$\bar{P}$ arising from the synchrotron and dust foregrounds. We compare
this with the $\sigma^2_{\rm pol,gw}(p, R)$ we obtained for each patch in
a single tensor-only primordial polarization realization (which is
proportional to $r^2$). The patches are sorted in decreasing order of the
"signal-to-noise" ratio
$\sigma_{\rm pol,gw}(p, R)/\sigma_{\rm pol,fg}(p, R)$. The next pixel on
the list is included in a patch list if it has no overlap with the patches
in the previously-determined higher signal-to-noise list. A patch is
considered to be r-clean if this polarization signal-to-noise exceeds
unity, a rather strong criterion. At 100 GHz, we found no "r=0.01"-clean
patches, seven "r=0.05"-clean patches and ten "r=0.1"-clean patches with
$f_{\rm sky} > 0.007$ ($R = 10°$). There are one "r=0.05"-clean patch and
two "r=0.1"-clean patches for $f_{\rm sky} > 0.03$ ($R = 20°$). At 150
GHz, we found no "r=0.05"-clean patches and one "r=0.1"-clean patch with
$f_{\rm sky} > 0.007$ but no "r=0.1"-clean patches for
$f_{\rm sky} > 0.03$.

The non-overlapping criterion is quite severe. Another measure of
r-cleanliness is to determine the fraction of sky with
$\sigma_{\rm pol,gw}(p, R)/\sigma_{\rm pol,fg}(p, R)$ above unity. The
"r"-clean fraction is clearly ~ 0 for those values of $r$ and $R$ with no
corresponding clean patches (as stated above). Here only the non-zero
values are reported. At 100 GHz, the "r=0.05"-clean fraction is ~ 0.14
($R = 10°$) and the "r=0.1"-clean fraction is ~ 0.24 ($R = 10°$). For both
values of $r$, there is no appreciable decrease in the sky fraction by
increasing the patch sizes to $R = 20°$. At 150 GHz, the "r=0.1"-clean
fraction is ~ 0.04 ($R = 10°$). It should be noted that as these sky
fractions do not necessarily correspond to contiguous regions, the sky
fraction of interest for small-sky B-mode experiments is in principle
smaller. The Planck Sky Model at the lower frequencies agrees with the
(extrapolated) synchrotron emission from WMAP, but the higher frequency
polarized dust emission really requires better observations, and awaits
the release of the Planck mission results.

4.3. Correlations of r with Other Cosmic Parameters 

Either detecting $r$ or placing a tight upper bound is crucial for
progress in inflation studies. Correlations of $r$ with other parameters
$q^a$ must be properly accounted for, since they are marginalized in the
reduction to the 1D $r$-posterior. The relative importance of the various
$q^a$ is determined by calculating the posterior-averaged
cross-correlations $\langle \delta r\,\delta q^a \rangle$, which depend
upon the experimental configuration and its noise. Within the quadratic
approximation for the posterior information action, the correlations can
be estimated from the inverse components, $[F^{-1}]^{ra}$, using the
Fisher matrix equation, with lensing as well as instrumental noise
included in the generalized noise matrix. Small steps in the main
parameters of the standard $\Lambda$CDM model ($\ln(\Omega_b h^2)$,
$\ln(\Omega_c h^2)$, $H_0$, $n_s$, $\tau$, $r$) from the fiducial WMAP7
values$^{14}$ were taken to determine $F$ by numerical differentiation.
The scalar amplitude $A_s$ is treated as a normalization parameter here,
so it is not included in the parameter list. We use two different
fiducial values for $r$, 0.2 and 0.01, and three values of $f_{\rm sky}$,
0.007, 0.07 and 0.75, for a Spider-like experiment. We use a
Gaussian prior on all parameters $q^a$ but $r$, with mean and
$w^{-1} = F_{\rm prior}$ given by the WMAP7 best-fit parameters. We choose
$(F_{\rm prior})_{ab} = \sigma^{-2}_{{\rm WMAP7},a}\,\delta_{ab}$, which gives
a weaker prior than the true WMAP7 results would give.
In the quadratic approximation to the posterior information action, the
total Fisher matrix is $F_t = F + F_{\rm prior}$.
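As a rough illustration of the procedure in this subsection, the sketch below builds a Fisher matrix by central finite differences of a parameter-dependent covariance and then adds a prior. It assumes a zero-mean Gaussian likelihood, for which $F_{ab} = \frac{1}{2}{\rm Tr}[C^{-1}\partial_a C\, C^{-1}\partial_b C]$, and a diagonal prior. The function names and the covariance callback `cov_fn` are hypothetical and are not the codes used for the results quoted here.

```python
import numpy as np

def fisher_matrix(cov_fn, q_fid, steps):
    """F_ab = 0.5 * Tr[C^-1 dC/dq_a C^-1 dC/dq_b] for a zero-mean Gaussian.

    cov_fn : callable returning the total covariance C(q) (hypothetical)
    q_fid  : fiducial parameter values
    steps  : finite-difference step for each parameter
    """
    q_fid = np.asarray(q_fid, dtype=float)
    c_inv = np.linalg.inv(cov_fn(q_fid))
    weighted = []
    for a, h in enumerate(steps):
        dq = np.zeros_like(q_fid)
        dq[a] = h
        # Central difference for dC/dq_a about the fiducial model.
        dc = (cov_fn(q_fid + dq) - cov_fn(q_fid - dq)) / (2.0 * h)
        weighted.append(c_inv @ dc)
    n = len(steps)
    fisher = np.empty((n, n))
    for a in range(n):
        for b in range(n):
            fisher[a, b] = 0.5 * np.trace(weighted[a] @ weighted[b])
    return fisher

def add_diagonal_prior(fisher, sigma_prior):
    """F_t = F + F_prior with F_prior = diag(1/sigma^2) (assumed diagonal).

    Use np.inf for parameters that carry no prior, such as r.
    """
    sigma_prior = np.asarray(sigma_prior, dtype=float)
    return fisher + np.diag(1.0 / sigma_prior**2)
```

The marginalized error on parameter $a$ then follows as `np.sqrt(np.linalg.inv(f_total)[a, a])`.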

The average deviation in $r$, $\langle \delta r|\delta q^a \rangle$, and
its variance, $\langle \Delta\delta r\,\Delta\delta r|\delta q^a \rangle$,
driven by given fluctuations in the other
14 http://lambda.gsfc.nasa.gov/product/map/dr4/params/lcdm_sz_lens_wmap7.cfm

TABLE 3

$\sigma_r$ from the full likelihood computed on a 2D $r$-$\tau$ grid
(bottom) cf. 1D, 2D and 6D Fisher determinations $[F^{-1}]^{rr}$ using
pixel-space matrices (middle) and the simplified $\ell$-space sums, with
$r_{\rm fid} = 0.12$. This demonstrates that the use of reduced parameter
spaces gives robust results, independent of cap sizes, here for
$f_{\rm sky}$ = 1, 0.07, 0.007.

method               param space   N_side = 32   N_side = 64    N_side = 128
                                   f_sky = 1     f_sky = 0.07   f_sky = 0.007

Fisher ℓ-space       1 param       0.022         0.018          0.037
                     2 param       0.023         0.018          0.037
                     6 param       0.025         0.020          0.037

Fisher pixel-space   1 param       0.022         0.019          0.034
                     2 param       0.023         0.019          0.034
                     6 param       0.025         0.020          0.035

grid-based           2 param       0.021         0.018          0.036

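parameters, $\delta q^a$, are

$$\overline{\delta r} = \langle \delta r|\delta q^a \rangle = [F_t^{-1}]^{ra}\,[F_t]_{ab}\,\delta q^b ,$$

$$\langle (\delta r - \overline{\delta r})^2|\delta q^a \rangle = [F_t^{-1}]^{rr} - [F_t^{-1}]^{ra}\,[F_t]_{ab}\,[F_t^{-1}]^{rb} .$$

If we let only one $\delta q^a$ at a time differ from zero, and
normalize the deviations to their 1-sigma values, we can express the
result in terms of a dimensionless measure $\rho_{ra}$ of the degree of
correlation:

$$\rho_{ra} = \frac{\langle \delta r|\delta q^a \rangle/\sigma_r}{\delta q^a/\sigma_a} = \frac{[F_t^{-1}]^{ra}}{\left([F_t^{-1}]^{rr}\,[F_t^{-1}]^{aa}\right)^{1/2}} .$$

The variance is
$\langle (\delta r - \overline{\delta r})^2|\delta q^a \rangle \approx \sigma_r^2 (1 - \rho_{ra}^2)$.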

For the full sky case, we find the largest $\rho_{ra}$ for $\tau$ and
$n_s$, with $\rho_{r\tau}$ and $\rho_{r n_s}$ both ≈ 0.25. For smaller sky
coverage, the degeneracy between $r$ and $\tau$ disappears since the main
constraints on $\tau$ come from the large scale polarization, which small
cut-sky cases are not sensitive to. The dominant correlations of $r$ are
with the matter density parameters $\Omega_c h^2$ and $\Omega_b h^2$, at
the 0.1-0.2 level, a consequence of the gravitational lensing induced BB
noise. Even in the 25% case for $\rho$, the constrained error diminishes
only by 3%.
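The 3% figure follows from the $\sqrt{1 - \rho^2}$ scaling of the constrained error. A minimal sketch of that arithmetic is given below, with the correlation coefficients read off an inverse Fisher matrix; the function names are illustrative only.

```python
import numpy as np

def correlation_with_r(fisher_total, r_index):
    """rho_ra = [F^-1]^{ra} / sqrt([F^-1]^{rr} [F^-1]^{aa}) for every parameter a."""
    cov = np.linalg.inv(np.asarray(fisher_total, dtype=float))
    sigma = np.sqrt(np.diag(cov))
    return cov[r_index] / (sigma[r_index] * sigma)

def constrained_error_fraction(rho):
    """Ratio of the constrained to the marginalized error on r: sqrt(1 - rho^2)."""
    return np.sqrt(1.0 - rho**2)
```

With $\rho = 0.25$ the ratio is $\sqrt{1 - 0.0625} \approx 0.968$, i.e. a reduction of about 3%.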

Thus we should be able to safely estimate the error on $r$ with all or
none of the basic six parameters held fixed. We verified this explicitly
by comparing the 2D uncertainties calculated from the full 2D $r$-$\tau$
grid with the full 6D uncertainties calculated from the inverse Fisher
matrix, in $\ell$-space and in pixel-pixel space, in Table 3 for
different $f_{\rm sky}$ and at different resolutions, defined here by the
value of $N_{\rm side}$. With all six parameters included, $\sigma_r$
increases by only ~ 10% over the single $\tau$-marginalized $\sigma_r$,
which justifies our exploration using a heavily truncated parameter
space to determine the errors in $r$.

4.4. Results in $r$-$\tau$ Space

In this section, we use $\tau$ as well as $r$ to make our 2D parameter
space since it has a direct impact on the BB reionization bump. We fix
the overall $C_\ell$ normalization for each parameter pair to the WMAP TT
measurement at $\ell = 220$. This is equivalent to having $A_s$ as an
adjustable parameter. If not otherwise stated, lensing has been included
in all of the following simulations with a fixed noise template,
linearly scaled with $A_s$ accordingly. Treating lensing in the noise
covariance completely takes into account its effect on sample variance.
It may be possible for it to be partly removed in the patch using
delensing algorithms (see e.g., Smith et al. (2008) and references
therein), leading to a reduced variance in the same way that we are
treating a foreground residual. However, treating lensing as a noise
source is a good assumption for our purposes here.

Fig. 4. — Uncertainty in measuring $r$ for different sky coverages with
Spider-like (top) and Planck-like (bottom) experiments, with and without
foregrounds (squares and triangles respectively), for the fiducial model
$r_{\rm fid} = 0.12$. The solid lines are the results of $\ell$-space
analysis (ignoring foregrounds). The analysis has been performed with
different resolutions for different $f_{\rm sky}$, ranging from
$N_{\rm side} = 32$ for full sky to $N_{\rm side} = 128$ for the smallest
sky coverage. The $f_{\rm sky}$ refers to the sky coverage before applying
the Galactic cut so for full sky $f_{\rm sky}$ is effectively ~ 0.75. The
dashed line is the $2\sigma_r$ if the full sky needs to be effectively
considered as a combination of several smaller patches with the
individual observed sky fraction being $f_{\rm sky}$ and the total area of
all patches equal to Galaxy-masked full sky.

The $2\sigma_r(f_{\rm sky})$ plots in Figures 4 and 5 are our main
results. Shown are two fiducial models with $r_{\rm fid}$ = 0.12, 0.001,
both having $\tau_{\rm fid} = 0.09$. The $f_{\rm sky}$ in the plots is the
sky coverage before the Galaxy is masked. The Galaxy cut starts coming
into the observed patch for $\theta_{\rm patch} \sim 40°$.

Fig. 5. — Similar to Figure 4 with $r_{\rm fid} = 0.001$.

The results are compared to the expected error bars on $r$ from a
simplified $\ell$-space analysis. Counting modes properly is a difficulty
in the $\ell$-space approximation for cut-skies. (This differs from the
full pixel-pixel covariance matrix analysis in which all modes are
naturally taken care of.) For the $\ell$-space approximation, we have
taken the mode number to be the naive $[f_{\rm sky}(2\ell+1)]$, where [..]
indicates the integer part. This imposes a low $\ell$-cut on the modes by
demanding $[f_{\rm sky}(2\ell+1)] \geq 1$, which overrides the $\ell$-cut
from the fundamental mode of the patch,
$2\ell + 1 = [2\pi/2\sin(\theta_{\rm patch}/2)]$, up to
$\theta_{\rm patch} \approx 30°$.
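The naive mode counting can be made concrete with a few lines of code. The sketch below reads the low-$\ell$ condition as requiring at least one counted mode and starts the search at $\ell = 1$; both choices, and the function names, are assumptions made for illustration.

```python
import math

def n_modes(ell, fsky):
    """Naive cut-sky mode count: integer part of fsky * (2*ell + 1)."""
    return math.floor(fsky * (2 * ell + 1))

def ell_min_from_mode_count(fsky):
    """Smallest ell with at least one counted mode (assumed threshold)."""
    ell = 1
    while n_modes(ell, fsky) < 1:
        ell += 1
    return ell

def ell_min_from_patch(theta_patch_deg):
    """Fundamental-mode cut from 2*ell + 1 = [2*pi / (2*sin(theta_patch/2))]."""
    two_ell_plus_one = math.floor(
        2 * math.pi / (2 * math.sin(math.radians(theta_patch_deg) / 2))
    )
    return max((two_ell_plus_one - 1) // 2, 1)
```

Under these assumptions the mode-count cut starts at $\ell = 7$ for $f_{\rm sky} = 0.07$ and at $\ell = 71$ for $f_{\rm sky} = 0.007$.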

This $\ell$-space $\sigma_r(f_{\rm sky})$ is a lower bound since it
ignores the mode mixing on the cut sky. Still, in the absence of
systematic errors and for the simplified noise assumed here, the errors
we find are near the true (matrix) values, as Figure 4 confirms for
$r_{\rm fid} = 0.12$. A similar measurement with $r_{\rm fid} = 0.2$ shows
the same thing, though with a more-flattened curve for
$\sigma_r(f_{\rm sky})$ for the Spider-like case and with foregrounds
playing a smaller role. E-B mixing does not seem to be a serious
impediment, at least down to $f_{\rm sky} \sim 0.01$. For the Spider-like
experiment, the error minimum is $2\sigma_r = 0.035$ for
$r_{\rm fid} = 0.12$, at $f_{\rm sky} \sim 0.15$, but the trough is broad.
For the low $r_{\rm fid} = 0.001$, for which only an upper limit can be
expected, Figure 5 shows the agreement in $\sigma_r(f_{\rm sky})$ between
$\ell$-space and pixel-space is not quite as good, especially for
$f_{\rm sky} \sim 0.25$-$0.5$ for which considerable observation time is
expended on the $\ell \sim 12$ BB valley (see Figure 2) where there is
little signal. The naive $\ell$-space approximation underestimates this,
but agreement with pixel-space is regained in runs with the reionization
bump removed, by setting $\tau = 0$; for this case the monotonic rise in
$\sigma_r(f_{\rm sky})$ with increasing $f_{\rm sky}$ continues to full sky.

Fig. 6. — The curves show $2\sigma_r$ as a function of $r_{\rm fid}$
obtained from the Fisher matrix in $\ell$- and pixel-space for
$f_{\rm sky} = 0.007$ (top) and 0.07 (bottom). The choices for the curves
are meant to unravel the impact cosmic variance, lensing, instrument
noise and mode mixing have on $\sigma_r$. The symbols show errors from the
full likelihood calculated on a gridded 2D parameter space, and agree
nicely for both pixel-space (squares) and $\ell$-space (diamonds).

Fig. 7. — $1\sigma$ and $2\sigma$ $r$-$\tau$ contours with and without
foregrounds for a Spider-like experiment with different sky cuts and for
a Planck-like Galaxy-masked experiment with effective
$f_{\rm sky} \sim 0.75$. In the two right panels the contours for the
combined Spider-like and Planck-like experiments are also plotted. The
black plus signs denote the input $r_{\rm fid} = 0.12$ and
$\tau_{\rm fid} = 0.09$. Expending Spider-like observing time on large sky
coverage would not improve much the Planck forecasted $\tau$ error, but
would decrease the combined $r$ error, suggesting the deep small-sky
option is better.

Extending to the full Galaxy-masked sky improves the upper limit on $r$
since the window function captures the low-$\ell$ bump. The $\ell$-space
and pixel-space calculations disagree slightly, but when the Galaxy mask
is removed, the estimates agree.

At small $f_{\rm sky}$, $2\sigma_r$ increases due to lensing which
dominates the total BB spectrum at small scales. The competition between
avoiding contamination by lensing and avoiding the $\ell \approx 12$
valley produces a weak minimum in $\sigma_r$ at $f_{\rm sky} \sim 0.15$ for
$r = 0.12$, when a detection is expected, and at $f_{\rm sky} \sim 0.03$
for $r = 0.001$, when an upper limit is expected. The full sky is weakly
optimal for setting an upper limit in the absence of foregrounds.

The Planck-like measurements in the lower plots of Figures 4 and 5 show
a rise in $2\sigma_r$ as $f_{\rm sky}$ drops since the information on the
large scales is lost while the pixel noise stays unchanged. The dashed
lines in these plots show the approximate $2\sigma_r$ for a full-sky
Galaxy-masked Planck-like experiment if the large-scale modes are
filtered, e.g., by time-domain filtering or due to high foreground
contamination, and thus the observed region is considered to be a
combination of smaller patches (adding up to the full sky in total
observed area).

Not surprisingly, we see that foregrounds mostly affect experiments with
larger $f_{\rm sky}$, and fiducial models with smaller $r$. We also see
that deep observations of quite small patches seem to do as well as
larger patches (observed less deeply) and even much better if $r$ is
small (for which the sample variance is very small and instrument noise
plays the dominant role).

Figure 6 shows how different components contribute to the error on $r$
calculated using the Fisher matrix for various $r_{\rm fid}$ and
$f_{\rm sky}$ = 0.007 and 0.07. As before the mode mixing is ignored in
the $\ell$-space calculation. If there were no lensing and no
mode-mixing, in the limit of no instrument noise, the only source of
error would be the sample variance, which is, as expected, proportional
to $r$. The solid black lines show the minimum irreducible errors due to
sample variance and lensing. We contrast this with calculations in both
pixel and $\ell$-space of two Spider-like experiments. One has 10 times
less noise than the fiducial Spider case, a noise level that can be seen
to give almost no contribution to the errors for these sky cuts since
lensing noise is dominant. The other has our standard Spider-like noise,
which can be seen to significantly add to the error. The effect of
neglecting mode-mixing in determining $\sigma_r$ vanishes as $r$
increases, since sample variance dominates the error, as a comparison of
the curves from the pixel-space and $\ell$-space analyses shows. The
overplotted symbols represent the errors from measuring the likelihood
curve in a gridded 2D parameter space (as explained earlier). The
$2\sigma_r$'s from the full method and the Fisher matrix approximation
are close. The small difference is because the $r$-likelihood curve is
not a perfect Gaussian.

Figure 7 shows the 2D $r$-$\tau$ contours for 3 different values of sky
coverage for a Spider-like experiment compared to a full-sky Planck-like
experiment (with Galaxy mask cut) with and without foreground
contamination. As expected, $\tau$ is unconstrained as $f_{\rm sky}$
decreases for the Spider-like experiment since $\tau$-constraints come
from the largest angular scales: what is optimal for $r$ detection is
awful for $\tau$ determination, for which all-sky is best.

4.5. Results in $r$-$n_s$ Space

In Figure 8 we have plotted the $r$-$n_s$ contours for an
$f_{\rm sky} = 0.08$ Spider-like experiment and for a full-sky Planck-like
survey, with and without foregrounds, using the model discussed in
§ 4.4. This shows almost no correlation between the two parameters for
these experimental cases, as expected from the discussion in § 4.3. It
also shows the remarkable set of inflation constraints that may arise
from Planck and Spider-like experiments.

4.6. Results in $r$-$n_t$ Space

Although detecting $r$ would provide an invaluable measure of the mean
acceleration parameter (and energy scale) of inflation, we want more:
the shape of the tensor power embodied in the tensor tilt $n_t$, which we
explore here in a 2D space by fixing $\tau$, $n_s$ and the other cosmic
parameters. Figure 9 shows the 2D contours for $r$-$n_t$ with
$r_{\rm fid} = 0.12$, and fiducial tensor tilt $n_{t,{\rm fid}} = -0.0150$
satisfying the inflation consistency condition, $n_t \approx -r/8$. Alas,
we see that $n_t$ is hardly constrained by Spider-like and Planck-like
experiments, no matter how large $f_{\rm sky}$ is. To see whether a
post-Planck deep all-sky experiment could modify this conclusion, we ran
our analysis using the specification of a putative mid-cost CMBPol
mission outlined in Baumann et al. (2009), using the frequency channels
described in Table 1. There is of course improvement, and the COrE and
PIXIE post-Planck missions would do better, but the relatively short
$\Delta\ell \sim 150$ baseline precludes even an ideal experiment from
providing a powerful test of inflation consistency.

Fig. 8. — $r$-$n_s$ contours for a Spider-like $f_{\rm sky} = 0.08$
experiment using pixel-space simulations are contrasted with those from
a Planck-like Galaxy-masked $f_{\rm sky} = 0.75$ experiment. CosmoMC
(http://cosmologist.info/cosmomc/) was used in the latter case to
properly take into account the correlations of $n_s$ with other cosmic
parameters, which, unlike for $r$, are non-negligible. Top has
$r_{\rm fid} = 0.12$ and bottom has 0.001; both have $n_s = 0.98$. Apart
from demonstrating the small $r$-$n_s$ correlation, the plots indicate a
possibly very rosy picture for constraining these two critical inflation
parameters.



4.7. Breaking $r$ up into $r_{X\beta}$-Shape Parameters: A Tensor
Consistency Check

Because $r$ is essentially a linear parameter (for given $A_s$), we are
effectively determining a single (very) broad-band power amplitude
multiplying a collection of fiducial $X$-template shapes $C_{X\ell}$
given by the gravitational wave powers. It is natural to test this
locked-in monolithic parameterization by introducing a collection of
parameters $r_{X\beta}$ multiplying individual $X$ and $\ell$-band
templates:



Ceei = C i EEl 



(10) 



Here $C^{(s)}_{EE\ell}$ is the scalar part of $C_{EE\ell}$, including
lensing, and $C^{(s)}_{BB\ell}$ is the lensed BB power. The overall
normalization is arranged so that $r_{X\beta} = r$ is the tensor
consistency condition. The $\chi_\beta(\ell)$'s are the $\beta$-windows.
These have often been taken to be top-hats satisfying a saturation
property $\sum_\beta \chi_\beta(\ell) = 1$ and an orthogonality property
$\chi_\beta(\ell)\chi_{\beta'}(\ell) = \delta_{\beta\beta'}$ in bandpower
work. However, the modes could also be quite overlapping as long as
saturation and the $r_{X\beta} = r$ normalization are satisfied.
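The window properties above are easy to check numerically for top-hat bands. The sketch below builds such windows and assembles a banded tensor spectrum as $\sum_\beta r_\beta \chi_\beta(\ell)$ times a template evaluated at $r = 1$. That assembled form is an assumption consistent with the description here rather than a transcription of equation (10), and the band edges and function names are hypothetical.

```python
import numpy as np

def top_hat_windows(ell, band_edges):
    """chi_beta(ell) = 1 inside band beta = [edge_b, edge_{b+1}), else 0."""
    ell = np.asarray(ell)
    edges = np.asarray(band_edges)
    return np.array([(ell >= lo) & (ell < hi)
                     for lo, hi in zip(edges[:-1], edges[1:])], dtype=float)

def banded_tensor_spectrum(r_bands, windows, tensor_template_r1):
    """Assumed form: sum_beta r_beta * chi_beta(ell) * C_ell(tensor, r = 1)."""
    weights = (np.asarray(r_bands, dtype=float)[:, None] * windows).sum(axis=0)
    return weights * tensor_template_r1
```

When every band amplitude equals the same $r$, saturation guarantees that the banded spectrum reduces to $r$ times the template, which is the tensor consistency case.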

This is a reasonable path to finding the tensor bandpowers for BB and EE
but, given the § 4.6 result on $n_t$, we will content ourselves with a 2D
example using one $\ell$-band $\beta$ and two $X$ parameters, $r_{EE}$
and $r_{BB}$. For this study, we keep $A_s$ fixed (cf. § 4.4 and 4.6).
The contours in Figure 10 show the degree to which the tensor
consistency, encoded in the $r_{EE} = r_{BB}$ line, can be checked. The
contours confirm the expectation that the B-modes are the most
influential source of information about primordial tensor perturbations,
since the large scalar contribution to EE swamps the tiny tensor signal,
inflating the error bars. Consistency checks like these have a long
history. In the first EE polarization detection papers, the EE amplitude
was shown to be consistent with the amplitude expected from TT
parameters (Kovac et al. 2002; Sievers 2004). In the first lensing
detections in the TT power spectra, the deviations from lens-free
results were shown to be consistent with expectations from the
parameters determined from the primary TT data (Reichardt et al. 2009;
Dunkley et al. 2011).

4.8. Breaking $f_{\rm sky}$ into Many Fields

Using multiple (foreground-minimized) fields to make up a total
$f_{\rm sky}$ is an approach that has been advocated for ground-based
strategies (e.g., for ABS$^{15}$). In Figure 11 we show the impact of
splitting $f_{\rm sky}$ into four patches, while keeping the total
integration time and the instrument noise constant. One does not lose
that much as long as the total probe is a few percent of the sky, a
consequence of the broad single-patch $\sigma_r(f_{\rm sky})$ minimum. The
number of polarization-foreground-clean patches is of course still to be
determined. We also varied the patch geometry; e.g., for an
$f_{\rm sky} \sim 0.08$ rectangular region with $r_{\rm fid} = 0.12$, we
get $2\sigma_r = 0.048$ without foregrounds, in good agreement with the
cap result $2\sigma_r = 0.050$.

15 http://www.princeton.edu/physics/research/cosmology-experiment/abs-experiment/

Fig. 9. — $1\sigma$ and $2\sigma$ $r$-$n_t$ contours for a Spider-like
experiment with different sky cuts and for a Planck-like Galaxy-masked
$f_{\rm sky} = 0.75$ experiment. The contours for a CMBPol-like experiment
as well as those for the combined Planck-like and Spider-like
experiments are plotted for comparison. The black line is the inflation
consistency line and the black plus sign is the fiducial input,
$r = 0.12$ and $n_t = -0.015$. Even with this CMBPol, inflation
consistency is not that well tested.

Fig. 10. — $1\sigma$ and $2\sigma$ contours in the $r_{EE}$-$r_{BB}$ plane
for a Spider-like experiment with different sky cuts and for a
Planck-like experiment with $f_{\rm sky} = 0.75$. The black solid lines
show the tensor consistency curves $r_{EE} = r_{BB}$ and the plus signs
show the fiducial $r_{EE} = r_{BB} = 0.12$ input model. As expected,
$r_{BB}$ is better determined than $r_{EE}$ and this tensor consistency is
not well tested.



5. SUMMARY AND CONCLUSIONS 

In this paper, we applied a full matrix likelihood analy- 
sis to multifrequency Q-U polarization maps and T-maps 
of forecasted data to determine the posterior probability 
distribution of r. 

5.1. Leakage Levels and Leakage Avoidance 

This method avoids the explicit linear E-B decomposi- 
tion of the polarization maps before doing the likelihood 
analysis and gives the best possible determination of r, 
provided that systematic errors are correctly modelled. 
For realistic cut-sky observations, we measured the level 
of BB contamination from the inevitable mode-mixing 
from the much larger EE power. In addition, there is 
leakage from instrumental effects, in particular with T 
seeping into Q and U, which has to be included in any 
approach. We have left the investigation of this issue to 
future work. 

5.2. Computational Feasibility of Exact Likelihoods 

It is often the case in CMB cosmology that the sheer number of pixels
precludes a direct full map-based likelihood procedure, with an
intermediate power spectrum determination done before parameter
estimation. However, for Spider and similar ground and balloon
experiments targeting $r$, relatively low resolution and restricted sky
coverage are all that is really needed for detection. The result is a
total pixel number that allows computationally feasible inverse and
determinant calculations of the large signal-plus-noise correlation
matrices $C_t = C_N + C_S(q)$, with contributions from both the
parameter-dependent signal covariance $C_S(q)$ and the generalized noise
$C_N$, which includes uncertainties from the foreground subtraction as
well as from instrumental and systematic noise in the maps.
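The inverse and determinant operations referred to here enter through the Gaussian pixel-space likelihood. A minimal sketch of one evaluation is shown below, using a Cholesky factorization so that the inverse and the log-determinant come from the same decomposition. It assumes a zero-mean data vector and dense matrices, and is an illustration rather than the code used for the forecasts.

```python
import numpy as np

def log_likelihood(data, c_signal, c_noise):
    """ln L = -0.5 * [d^T C_t^-1 d + ln det C_t + n ln(2 pi)], with C_t = C_N + C_S."""
    c_total = c_noise + c_signal
    chol = np.linalg.cholesky(c_total)          # C_t = L L^T, L lower triangular
    # z = L^-1 d, so that d^T C_t^-1 d = z . z (a general solver is used for brevity).
    z = np.linalg.solve(chol, data)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    n = data.size
    return -0.5 * (z @ z + log_det + n * np.log(2.0 * np.pi))
```

The cost is dominated by the factorization, which scales as the cube of the number of pixels; this is why modest resolution and restricted sky coverage keep the exact calculation tractable.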

Matrix methods have had a long history, dating from the earliest CMB
data sets, e.g., Bond & Crittenden (2001). For example, they were used
for COBE, Saskatoon, Boomerang, and CBI analyses. Often compression was
used, e.g., to signal-to-noise eigenmodes (Bond 1995; Bond & Crittenden
2001) or by coarse-grained gridding (Myers et al. 2003), to make the
matrix manipulations tractable. With Boomerang, an important aspect was
to make sure all issues regarding data-filtering, inhomogeneous and
aspherical beams, transfer functions, striping etc. were properly
included. Invariably, a Monte Carlo simulator of each experiment has
been built, in which simulated timestreams include as many effects from
systematics and data processing as one can think of.

Fig. 11. — When one patch covering $f_{\rm sky}$ is broken up into four
$f_{\rm sky}/4$ cap-patches, but the noise and observing time remain
constant, the ($\tau$-marginalized) $r$-errors remain similar except at
very small $f_{\rm sky}$. We also show that factors of two changes in the
noise swamp this effect. The calculations were done with
$r_{\rm fid} = 0.12$ in the pixel-space except for the highest sky
coverages where the pixel and $\ell$-space analyses are in excellent
agreement. The effect of foreground contamination and Galaxy cut has not
been taken into account here.

5.3. Matrix Estimation from Monte Carlo Noise and Signal Simulations and
Relation to Master/XFaster

The Master/XFaster approach encodes this in isotropized $\ell$-space
filters and rotationally symmetrized masks which allow one to relate the
underlying all-sky $C_{S,cX\ell}$ to the filtered cut sky. Similarly an
isotropized noise $C_{N,cX\ell}$ is also determined by taking processed
noise timestreams, creating maps with them, $Y_{\ell m}$ transforming
them, then forming a quadratic average over noise samples $j_s$,
$C_{N,cX\ell} = \sum_{j_s,m} |a_{N j_s,cX\ell m}|^2/[(2\ell+1)N_s]$.
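The quadratic average over noise samples amounts to a mean of $|a_{\ell m}|^2$ over both $m$ and realizations. A minimal sketch for a single multipole, channel and polarization type follows; the array layout (one row per realization, one column per $m$) and the function name are assumptions made for illustration.

```python
import numpy as np

def noise_cl_from_samples(alm_samples):
    """C_N,l = sum_{j,m} |a_{j,lm}|^2 / [(2l+1) N_s] for one multipole l.

    alm_samples : complex array of shape (N_s, 2l+1), one row per noise
                  realization and one column per m = -l..l (assumed layout).
    """
    alm_samples = np.asarray(alm_samples)
    n_s, n_m = alm_samples.shape
    return np.sum(np.abs(alm_samples) ** 2) / (n_m * n_s)
```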

When one has a large number of detectors, using only cross-correlations
and no auto-correlations has an advantage, namely that the cross-noise
is small, arising from systematic effects in the arrays and instrument
as a whole. Precise modelling of the auto-noise is not easy. However,
any operation that can be done for Master or XFaster can also be done to
estimate the noise matrices, using noise sample sums. (Getting
convergence of small off-diagonal components may require many samples.)
Matrices have the advantage that they naturally allow for anisotropic
and inhomogeneous components in the noise maps (including striping
effects), in the beam maps and in the foreground maps. There are issues
about optimal estimation of the generalized pixel-pixel matrices that
one would like to tune, but there are no fundamental obstacles to making
the $C_N$ and $C_S$ matrices highly accurate for parameter estimation.
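The noise sample sums mentioned above can be illustrated with the simplest estimator, an average of outer products of simulated noise maps. The sketch assumes zero-mean noise maps stacked as rows of an array; the function name is hypothetical.

```python
import numpy as np

def noise_matrix_from_samples(noise_maps):
    """C_N ~ (1/N_s) sum_j n_j n_j^T for zero-mean noise maps of shape (N_s, N_pix)."""
    noise_maps = np.asarray(noise_maps, dtype=float)
    return noise_maps.T @ noise_maps / noise_maps.shape[0]
```

For Gaussian noise the scatter of each estimated element falls off as $1/\sqrt{N_s}$, which is why small off-diagonal components need many realizations to converge.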

WMAP used a matrix-based likelihood for low $\ell$, connected to an
isotropized $\ell$-space likelihood covering the high $\ell$'s. Planck is
doing the same. We expect such a hybridized likelihood code will also be
used for Spider-like experiments for routine parameter estimation, even
though we think one can get away with a full matrix likelihood code.

If simulated timestreams are used for $C_N$ and $C_S$ estimation,
generalized pixels may prove preferable to the usual spatial pixels. The
Cosmic Background Imager CBI (Myers et al. 2003; Sievers 2004) used
reciprocal space pixels for the primary construction, rather natural for
an interferometry experiment where the timestream analog is a set of
visibilities. ACT and QUaD have also done their power spectrum
estimation in the Fourier transform space of spatial maps.

5.4. The CBIpol Approach as a Guide for Small 
Deep-sky Analyses 

The use of matrix likelihood codes does not mean that E and B maps will
not be constructed, just that parameters would not be extracted from
them. The CBI example of how such E, B maps were made and used, and why
bandpower and parameter estimations did not use E, B maps, serves as a
paradigm for how things could proceed for Spider-like data. The CBI data
were compressed (via a GRIDR code) onto a discrete (reciprocal) lattice
of wavenumbers by projecting measured interferometer visibilities onto a
gridded 2D K-space. A direct unitary transformation takes such a basis
of "momentum" modes into a basis of spatial modes in real space where
Q-U is a more appropriate representation. An important point is that the
polarization map estimators evaluated on the discrete wavenumbers of the
lattice are linear combinations of the continuous wavenumbers, the
mode-coupling of finite maps which also leads to an E-B mixing.

In the lattice representation, the resulting sizes of the correlation
matrices for CBI were quite tractable for direct inversion and the full
likelihood was evaluated (via an MLikely code) to determine bandpowers
for TT, EE, BB and TE, without separation of the Fourier maps into E and
B.

An optimal linear map reconstruction of E and B was done for
visualization purposes, with real-space and momentum-space maps showing
the CBI E and B Wiener-filtered means, accompanied by a few maps showing
typical fluctuation maps about the mean maps. These were contour maps,
since the usual headless vector polarization plots are of length the
polarization degree, $\sqrt{Q^2 + U^2}$, tilted at an angle
$\arctan(U/Q)/2$.

For Spider-like bolometer-based experiments for which 
the raw data are bolometer time-streams from which 
QU maps are constructed, the compression step leads to 
tractable matrices as in the CBIpol case, although in the 
first instance the pixelization choice may be in real space 
rather than in wavenumber space or in a generalized-pixel 
space. Just as with CBIpol, parameters and bandpowers 
would be determined with direct likelihood calculations, 
yet Wiener-filtered EB maps would still be made for vi- 
sualization. 
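For the visualization step, the headless-vector convention mentioned earlier in this subsection (length $\sqrt{Q^2 + U^2}$, tilt $\arctan(U/Q)/2$) can be computed per pixel as in the sketch below. Using `arctan2` to resolve the quadrant is a choice made here, and the function name is illustrative.

```python
import numpy as np

def polarization_amplitude_and_angle(q, u):
    """Return P = sqrt(Q^2 + U^2) and the angle psi = 0.5 * arctan2(U, Q) in radians."""
    q = np.asarray(q, dtype=float)
    u = np.asarray(u, dtype=float)
    return np.hypot(q, u), 0.5 * np.arctan2(u, q)
```

The factor of one half reflects the spin-2 nature of polarization: a pure $+U$ signal corresponds to a 45° tilt and a pure $-Q$ signal to 90°.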

5.5. Exact 2D Likelihood Computation 

Given the matrix construction method, we determined the posterior
probabilities on reduced 2D-grids consisting of $r$ and one other cosmic
parameter, in many cases the Thomson scattering depth to reionization
$\tau$. The grid could be extended to higher dimensions, as they were in
early CMB analyses of COBE, Boomerang, CBI and ACBAR. More efficiently,
MCMC chains could be used to explore the posterior probability surface.
Since, as we have shown, $r$ is relatively weakly correlated with the
other standard cosmic parameters, our use of a reduced dimensionality is
accurate. We targeted $\tau$ for a second parameter, although it too is
weakly correlated for Spider-like experiments probing modest
$f_{\rm sky}$, because of its importance for the reionization bump in BB
which is picked up by large $f_{\rm sky}$ experiments such as Planck. We
showed that as long as the input value $r_{\rm fid}$ is reasonably larger
than the error $\sigma_r$, e.g., ~ 0.1, $r_{\rm fid}$ can be
well-recovered by our methods.
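The reduction from a 2D grid to the 1D $r$-posterior can be sketched as follows. The example assumes flat priors, a uniformly spaced grid with $r$ along the first axis and $\tau$ along the second, and a hypothetical function name; it is an illustration of the marginalization step only.

```python
import numpy as np

def marginal_r_posterior(log_like_grid, r_values):
    """Marginalize a 2D (r, tau) log-likelihood grid over tau.

    log_like_grid : array of shape (N_r, N_tau), uniform spacing assumed
    r_values      : the N_r grid values of r
    Returns the normalized 1D posterior p(r), its mean and its standard deviation.
    """
    log_like_grid = np.asarray(log_like_grid, dtype=float)
    r_values = np.asarray(r_values, dtype=float)
    post = np.exp(log_like_grid - log_like_grid.max())   # subtract the peak to avoid overflow
    dr = r_values[1] - r_values[0]
    p_r = post.sum(axis=1)                               # sum over the tau axis
    p_r /= p_r.sum() * dr                                # normalize to unit integral
    mean = np.sum(r_values * p_r) * dr
    sigma = np.sqrt(np.sum((r_values - mean) ** 2 * p_r) * dr)
    return p_r, mean, sigma
```

Because the posterior is not exactly Gaussian in $r$, the standard deviation obtained this way can differ slightly from the Fisher estimate.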

5.6. The Inflation and Tensor Consistency Checks 

We have used $r$ and $n_t$ for our reduced 2D parameter space to see how
well the inflation consistency condition, $n_t \approx -r/8$, can be
tested. For example, with $r_{\rm fid} = 0.12$ and the consistency value
$n_{t,{\rm fid}} = -0.015$, we obtain $2\sigma_r \approx 0.036$ and
$2\sigma_{n_t} \approx 0.28$. The large 1-sigma error on $n_t$ is what one
might have expected given the relatively small $\ell$-baseline
(reminiscent of the ±0.2 limit on $n_s$ from the even smaller baseline
COBE DMR data). Thus, although breaking up $r$ into bands will be useful,
the $n_t$ slope that follows will not be powerful enough to test
consistency. With CMBPol and at $N_{\rm side} = 512$, the errors are
$2\sigma_r \approx 0.014$ and $2\sigma_{n_t} \approx 0.07$, still too
large. A more prosaic internal consistency check was done to show that
what one thinks is $r$ from the total BB agrees with what one gets from
the less-tensor-sensitive total EE.
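The weakness of the consistency test can be read directly from the quoted numbers by comparing the tilt predicted by $n_t \approx -r/8$ with the 1-sigma error on $n_t$. A minimal sketch of that arithmetic follows; the function names are illustrative.

```python
def consistency_tilt(r):
    """Tensor tilt implied by the inflation consistency condition n_t = -r / 8."""
    return -r / 8.0

def tilt_in_sigma(r, two_sigma_nt):
    """Predicted |n_t| in units of the 1-sigma error on n_t."""
    return abs(consistency_tilt(r)) / (0.5 * two_sigma_nt)
```

For $r = 0.12$ the predicted tilt is $-0.015$, which is about 0.11 sigma for $2\sigma_{n_t} \approx 0.28$ and about 0.43 sigma for the CMBPol value $2\sigma_{n_t} \approx 0.07$, so neither case distinguishes the consistency value from zero tilt.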

5.7. Relation to Planck 

We based our Planck-like case on the Blue Book detector specifications.
The actual in-flight performance is quite similar (Planck HFI Core Team
et al. 2011; Mennella et al. 2011). It is encouraging that five full sky
surveys of six months seem possible, as we near the end of the fourth.
What will emerge from the actual Planck polarization analysis may be
quite different from the simplified foreground-free
$2\sigma_r(f_{\rm sky} = 0.75) \sim 0.015$ forecast of white experimental
noise with well-subtracted foregrounds of known residual, and with no
systematics. This relies on the BB reionization bump being picked up,
but the required low $\ell$'s are especially susceptible to the
foreground-subtraction residuals
($2\sigma_r(f_{\rm sky} = 0.75) \sim 0.05$) and systematic effects. Some
of the issues are described in Efstathiou et al. (2009). Irrespective of
how well Planck wrestles with the low $\ell$ issues, it will be able to
analyze many patches within the 75% of the sky, rank-ordered by degree
of foreground contamination. Although such a procedure would lose the
reionization bump, robustness to foreground threshold variation of any
$r$-detection could be well-demonstrated. Apart from its many other
virtues, Planck should be very good for this.

5.8. Relation to Spider 

The same strategy of using many fields with the lowest 
foregrounds to make up the total / s k y may also prove use- 
ful for Spider-like experiments (such as the ground-based 
ABS). We showed that splitting / s k y into four patches 
with fixed integration time and the instrument noise re- 
sults in only a small loss in r-sensitivity because ay (/ s k y ) 



has a relatively wide single-patch minimum. How many 
polarization-forcground-clean patches there are is still to 
be determined. 

Although the specifications we chose for "Spider-like" were motivated by a bolometer array experiment feasible with current technology, our forecasts should not be taken as realistic mocks of the true Spider, which is under development and for which a number of campaigns are envisaged (see the footnote under Spider-like in Table 1). The techniques used here have, however, already been applied in Spider forecast papers using more realistic statistically inhomogeneous noise, scanning strategies and observational durations, e.g., in Filippini et al. (2010) and Fraisse et al. (2011). On $f_{\rm sky} \sim 0.1$, $r_{\rm fid} = 0.01$ simulations, we compared the Fraisse et al. (2011) non-uniform noise, modulated spatially by the scanning strategy's number of hits per pixel, with uniform white noise with the same integrated noise power. Although the standard deviation of the noise rms was about a factor of two times the mean noise rms, with the largest impact near the scanning boundaries, we found very similar results for the posterior, showing this paper's conclusions are insensitive to our use of uniform white noise. (Of course the foreground noise radically alters the whiteness, and this has been included by us, but only in a statistically isotropic way; the Galactic latitude dependence breaks this isotropy just as the pixel hits do.) In § 4.4 we showed that in the absence of foregrounds our Spider-like case could achieve $2\sigma_r \approx 0.02$ over a broad range of $f_{\rm sky}$. The r-posteriors shown in Figure 12 were made with the numerical codes described here, for the Spider experiment as envisaged in Fraisse et al. (2011) (labeled as "Spider" in the plot), and for an even more ambitious campaign of subsequent flights of the Spider instrument, as proposed for SCIP. We see that the performance of the experiment with Spider-like specifications used in this paper is very close to the actual Spider. A different foreground model used in Fraisse et al. (2011) for $f_{\rm sky} \sim 0.1$ led to a similar $\sim 50\%$ error degradation.
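
The comparison between scan-modulated and uniform white noise described above can be sketched as follows. This is a minimal illustration, not the pipeline used for the forecasts. It assumes equal-area pixels, a per-pixel white-noise rms that scales as the inverse square root of the number of hits, and it interprets "the same integrated noise power" as equal summed pixel variance. The toy hit map and the function names `pixel_noise_rms` and `matched_uniform_rms` are hypothetical.

```python
import numpy as np


def pixel_noise_rms(hits, sigma_per_hit):
    """Per-pixel white-noise rms for a hit map (assumes rms ~ 1/sqrt(hits))."""
    hits = np.asarray(hits, dtype=float)
    if np.any(hits <= 0):
        raise ValueError("every pixel must be observed at least once")
    return sigma_per_hit / np.sqrt(hits)


def matched_uniform_rms(rms_pix):
    """Uniform rms whose summed pixel variance equals that of rms_pix."""
    return float(np.sqrt(np.mean(np.square(rms_pix))))


rng = np.random.default_rng(0)

# Toy hit map (hypothetical): many sparsely observed pixels and fewer
# well-observed ones, loosely mimicking the edges of a scanned patch.
hits = 1.0 + 50.0 * rng.random(10_000) ** 2

rms_pix = pixel_noise_rms(hits, sigma_per_hit=1.0)
rms_uni = matched_uniform_rms(rms_pix)

# One noise realization of each kind; either could be added to a signal map.
noise_scanned = rng.standard_normal(hits.size) * rms_pix
noise_uniform = rng.standard_normal(hits.size) * rms_uni

print("uniform rms:", rms_uni)
print("spread of per-pixel rms / mean rms:", rms_pix.std() / rms_pix.mean())
```

Feeding both realizations through the same likelihood analysis and comparing the resulting r-posteriors would mirror the test reported in the text.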

5.9. History and Forecasts of r Constraints 

When the large-angle CMB anisotropies were first detected with COBE DMR, the broad-band TT power amplitude ($\ell < 20$), with wavenumbers $k^{-1} > 1000$ Mpc, was related to the linear density power spectrum amplitude at the radically different $\sim 6$ Mpc scale, assuming a nearly scale-invariant primordial spectrum:
a 8 ps 0.85e-( T - ai 7\/l + 0.6r x 1°£ 6 for typical ACDM 
parameters popular in the mid-nineties, $\Omega_\Lambda \sim 2/3$, $h \sim 0.7$ (Bond 1996), rather similar to the values now. Requiring $\sigma_8 > 0.7$ to get reasonable cluster abundances at zero redshift, a venerable cosmological requirement from the 80s, gives a rough constraint on r from the COBE data in conjunction with large scale structure (LSS) data: $2\sigma_r < 1$ for current $\tau$ values. However, $\tau$ only had an upper limit until WMAP1, with a more accurate determination waiting until WMAP3.

The first 2003 WMAP1 constraint on r from TT and TE CMB-only data (with weak priors) was $2\sigma_r < 0.81$, reducing to $2\sigma_r < 0.64$ with the WMAP3 TT, TE and EE data, and other TT CMB data available in 2005. It decreased to 0.31 with the LSS data of the time (MacTavish et al. 2006). The most recent r-constraint from the low-$\ell$ amplitude and shape of the TT and EE spectra from WMAP7+ACT is the upper limit $2\sigma_r \sim 0.25$, reducing to 0.19 when LSS is added (Dunkley et al. 2010).

Fig. 12. The r-likelihood curve for the Spider-like experiment (which is the default experiment used in this paper) with $r_{\rm fid} = 0.001$ and $f_{\rm sky} = 0.08$ is contrasted with proposed stages in balloon-borne experimenting with an actual Spider focal plane. The one labeled as Spider corresponds to the actual, more recent Spider proposal with two flights described in Fraisse et al. (2011) (see the footnotes of Table 1). The SCIP envisages three subsequent flights of the Spider payload. We see that the future sensitivity may exceed this paper's forecasted constraints. These Spider likelihood curves have been contrasted with the current limit on r from CMB (ACT+WMAP7) alone and from CMB with measurements of $H_0$ and BAO (Dunkley et al. 2010). The marginalized 1D likelihood curves are based on the publicly available chains (http://lambda.gsfc.nasa.gov/product/act/act_chainsv2_get.cfm), binned into 50 bins, and Gaussian-fitted to plot the very small r region where not enough points were available. These current and near-future constraints are compared to the expectation from the next generation of space CMB missions. As an example, we used the results of simulations for a full-sky CMBPol experiment (see Table 1), again with $r_{\rm fid} = 0.001$, which gives $2\sigma_r \sim 0.0004$, comparable to the forecasted errors from PIXIE, $2\sigma_r \sim 0.0004$, and COrE, $2\sigma_r \sim 0.0007$.

To make a further leap awaits an effective BB mode constraint. As we have seen, Planck can give $2\sigma_r \approx 0.015$-$0.05$ and Spider $0.014$-$0.02$. The COrE satellite proposal (The COrE Collaboration 2011) suggests that a better than 3-sigma detection could be made for r above 0.001 with bolometer arrays in space. The PIXIE satellite proposal (Kogut et al. 2011) claims $2\sigma_r \approx 4 \times 10^{-4}$ is achievable with Fourier Transform Spectrometry. Applying our methods to CMBPol specifications (Baumann et al. 2009) we get $2\sigma_r \approx 4 \times 10^{-4}$ for $r_{\rm fid} = 0.001$ and $2\sigma_r \approx 1.2 \times 10^{-4}$ for $r_{\rm fid} = 0.0001$. If $r_{\rm fid}$ is as large as 0.12, as in the simple $m^2\phi^2$ chaotic inflation, we get $2\sigma_r \approx 0.015$ (and $2\sigma_{n_t} \approx 0.07$, encompassing the consistency input of $n_t = -0.015$). For a noiseless all-sky experiment, hence with errors from cosmic variance only, we get $2\sigma_r \approx 10^{-4}$ for $N_{\rm side} = 128$ for tiny $r_{\rm fid}$. It is unclear at this time how much inexact foreground subtraction and lensing noise will limit r determinations in these ideal cases.
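
The consistency input quoted above for $r_{\rm fid} = 0.12$ matches the standard single-field slow-roll consistency relation between the tensor tilt and r. The relation is not written out in this section, so its use here is an assumption about what was adopted:

```latex
n_t = -\frac{r}{8}, \qquad r_{\rm fid} = 0.12 \;\Rightarrow\; n_t = -\frac{0.12}{8} = -0.015 .
```

With $2\sigma_{n_t} \approx 0.07$, the forecast error on $n_t$ is several times larger than $|n_t| = 0.015$, which is why the input value is described as being encompassed rather than detected.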



5.10. The 1D Shannon Entropy of r

We have described another way to cast the improvements expected in r-estimation as experiments attain higher and higher sensitivity: the marginalized 1D Shannon entropy $\Delta S_{1f}(r)$ for r. This measures the (phase-space) volume of r-space that the measurement allows. It is obtained by direct integration over the normalized 1D likelihood for r, with all non-Gaussian features in the likelihood properly included. We have found in practice that the approximation $\Delta S_{1f}(r) \approx \Delta \ln[\sigma_r \sqrt{2\pi}]$, with $\sigma_r$ determined by the forced Gaussianization described in the paper, works quite well, so in a way we are just restating the error improvements in the information-theoretic language of bits.
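
The two routes to the entropy change described above, direct integration over a normalized 1D likelihood and the Gaussian-width shortcut, can be compared with a short sketch. It is illustrative only: the likelihoods are toy Gaussians on a uniform grid with made-up widths, the integral is a simple Riemann sum, and the function names `entropy_nats` and `gaussian_delta_bits` are hypothetical rather than taken from the paper's code.

```python
import numpy as np

LN2 = np.log(2.0)


def entropy_nats(r, likelihood):
    """Entropy -sum p ln p dr of a 1D likelihood tabulated on a uniform r grid."""
    r = np.asarray(r, dtype=float)
    likelihood = np.asarray(likelihood, dtype=float)
    dr = r[1] - r[0]
    p = likelihood / (likelihood.sum() * dr)  # normalize to unit integral
    nonzero = p > 0                           # skip p = 0, where p ln p -> 0
    return float(-np.sum(p[nonzero] * np.log(p[nonzero])) * dr)


def gaussian_delta_bits(sigma_new, sigma_ref):
    """Delta ln[sigma sqrt(2 pi)] between two Gaussian widths, in bits."""
    return float(np.log(sigma_new / sigma_ref) / LN2)


# Toy Gaussian likelihoods (illustrative widths, not values from the paper).
r = np.linspace(0.0, 1.0, 20001)
wide = np.exp(-0.5 * ((r - 0.5) / 0.05) ** 2)
narrow = np.exp(-0.5 * ((r - 0.5) / 0.005) ** 2)

delta_direct = (entropy_nats(r, narrow) - entropy_nats(r, wide)) / LN2
delta_gauss = gaussian_delta_bits(0.005, 0.05)
print(delta_direct, delta_gauss)
```

For exactly Gaussian likelihoods the two numbers agree, since the constant terms cancel in the difference. For a skewed or truncated likelihood, `entropy_nats` would still apply while the shortcut becomes an approximation.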

We use the WMAP7+ACT TT, TE and EE $2\sigma_r \sim 0.25$ (Dunkley et al. 2010) constraint for our baseline. The first WMAP1 constraint in 2003 (Spergel et al. 2003), with $\Delta S_{1f}(r) = 1.70$ bits, had, of course, higher information entropy. Here, as in the abstract, we have translated from nats to bits. The recent WMAP7+SPT results (Keisler et al. 2011) with $2\sigma_r \sim 0.21$ give a slight decrease in the entropy ($\Delta S_{1f}(r) = -0.25$ bits) compared to the baseline. The asymptotic perfect noiseless all-sky experiment gives (the somewhat r-dependent) $\Delta S_{1f}(r) \approx -11$ bits, the limit on obtainable knowledge from the CMB. The proposed post-Planck COrE, PIXIE and CMBPol-like experiments claim up to -9 bits. For the Spider-like experiments forecasted here, the foreground-free decrease is -4.2 bits (and -3.6 bits with a 95% effective component separation). Thus balloon-borne and ground-based experiments with large arrays making deep surveys focussing on a relatively clean few percent of the sky yield tensor information at least comparable to shallow and wide surveys and are a powerful step towards a near-perfect deep and wide satellite future.
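
As a rough cross-check, the bit counts above can be recovered from the quoted $2\sigma_r$ values using only the Gaussian approximation, in which the entropy change is the base-2 logarithm of the ratio of widths. The sketch below assumes that approximation throughout. The dictionary labels are ours, and the Spider-like values (0.014 foreground-free, 0.02 with residuals) are the ones quoted in the abstract.

```python
import math


def bits_relative_to(two_sigma, two_sigma_ref):
    """Entropy change in bits between two Gaussian widths (ratio of 2-sigma values)."""
    return math.log2(two_sigma / two_sigma_ref)


BASELINE = 0.25  # WMAP7+ACT 2-sigma limit used as the baseline in the text

quoted_two_sigma = {
    "WMAP1": 0.81,
    "WMAP7+SPT": 0.21,
    "Spider-like, foreground-free": 0.014,
    "Spider-like, with residuals": 0.02,
    "CMBPol (r_fid = 0.001)": 4e-4,
    "noiseless all-sky": 1e-4,
}

bits = {name: bits_relative_to(v, BASELINE) for name, v in quoted_two_sigma.items()}

for name, value in bits.items():
    print(f"{name:32s} {value:+.2f} bits")
```

The agreement with the quoted numbers illustrates the statement that the entropy language largely restates the error improvements. Any residual differences would come from non-Gaussian features that only the direct integration captures.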

We would like to thank our many Spider, ABS and Planck collaborators for many stimulating discussions about the experimental assault on CMB tensor mode detection. We would like to thank William C. Jones for his helpful comments on the text. We thank Marc-Antoine Miville-Deschênes for advice and aid on foregrounds. Support from NSERC, the Canadian Institute for Advanced Research, and the Canadian Space Agency (for Planck HFI and Spider work) is gratefully acknowledged. Part of the research described in this paper was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration. The large matrix computations were performed using the SciNet facility at the University of Toronto. Some of the results in this paper have been derived using the HEALPix package (Gorski et al. 2005), http://healpix.jpl.nasa.gov.






REFERENCES 



Baumann, D., et al. 2009, in American Institute of Physics Conference Series, Vol. 1141, American Institute of Physics Conference Series, ed. S. Dodelson, D. Baumann, A. Cooray, J. Dunkley, A. Fraisse, M. G. Jackson, A. Kogut, L. Krauss, M. Zaldarriaga, & K. Smith, 10-120
Bond, J. R. 1995, Physical Review Letters, 74, 4369
Bond, J. R. 1996, in Cosmology and Large Scale Structure, ed. R. Schaeffer, J. Silk, M. Spiro, & J. Zinn-Justin, 469-+
Bond, J. R., & Crittenden, R. 2001, in NATO ASIC Proc. 565: Structure Formation in the Universe, ed. R. G. Crittenden & N. G. Turok, 241-+
Bond, J. R., Jaffe, A. H., & Knox, L. 1998, Phys. Rev. D, 57, 2117
Bunn, E. F. 2002, Phys. Rev. D, 65, 043003
Bunn, E. F. 2011, Phys. Rev. D, 83, 083003
Bunn, E. F., Zaldarriaga, M., Tegmark, M., & de Oliveira-Costa, A. 2003, Phys. Rev. D, 67, 023501
Challinor, A., & Chon, G. 2005, MNRAS, 360, 509
Chon, G., Challinor, A., Prunet, S., Hivon, E., & Szapudi, I. 2004, MNRAS, 350, 914
Chuss, D. T., et al. 2010, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 7741, Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series
Contaldi, C. R., et al. 2010, in prep.
de Bernardis, P., et al. 2000, Nature, 404, 955
Delabrouille, J., et al. 2011, in prep.
Dunkley, J., et al. 2010, ArXiv e-prints
Efstathiou, G., Gratton, S., & Paci, F. 2009, MNRAS, 397, 1355
Filippini, J. P., et al. 2010, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 7741, Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series
Fraisse, A. A., et al. 2011, ArXiv e-prints
Gorski, K. M., Hivon, E., Banday, A. J., Wandelt, B. D., Hansen, F. K., Reinecke, M., & Bartelmann, M. 2005, ApJ, 622, 759
Grain, J., Tristram, M., & Stompor, R. 2009, Phys. Rev. D, 79, 123515
Hansen, F. K., & Gorski, K. M. 2003, MNRAS, 343, 559
Hivon, E., Gorski, K. M., Netterfield, C. B., Crill, B. P., Prunet, S., & Hansen, F. 2002, ApJ, 567, 2
Kamionkowski, M., Kosowsky, A., & Stebbins, A. 1997, Phys. Rev. D, 55, 7368
Keisler, R., et al. 2011, ArXiv e-prints
Kogut, A., et al. 2011, ArXiv e-prints



Kovac, J. M., Leitch, E. M., Pryke, C., Carlstrom, J. E., Halverson, N. W., & Holzapfel, W. L. 2002, Nature, 420, 772
Lange, A. E., et al. 2001, Phys. Rev. D, 63, 042001
Leach, S. M., et al. 2008, A&A, 491, 597
Lewis, A., Challinor, A., & Turok, N. 2002, Phys. Rev. D, 65, 023505
MacKay, D. J. 2003, Information Theory, Inference, and Learning Algorithms, 1st edn. (Cambridge: Cambridge University Press)
MacTavish, C. J., et al. 2006, ApJ, 647, 799
Mennella, A., et al. 2011, ArXiv e-prints
Montroy, T. E., et al. 2006, ApJ, 647, 813
Myers, S. T., et al. 2003, ApJ, 591, 575
O'Dea, D. T., Clark, C. N., Contaldi, C. R., & MacTavish, C. J. 2011a, ArXiv e-prints
O'Dea, D. T., et al. 2011b, ArXiv e-prints
Page, L., et al. 2007, ApJS, 170, 335
Piacentini, F., et al. 2006, ApJ, 647, 833
Planck HFI Core Team et al. 2011, ArXiv e-prints
Reeves, R. A., Bustos, R., Torres, S., & Readhead, A. 2006, in Revista Mexicana de Astronomia y Astrofisica, vol. 27, Vol. 26, Revista Mexicana de Astronomia y Astrofisica Conference Series, 121-122
Reichardt, C. L., et al. 2009, ApJ, 694, 1200
Rocha, G., Contaldi, C. R., Colombo, L. P. L., Bond, J. R., Gorski, K. M., & Lawrence, C. R. 2010, ArXiv e-prints
Ruhl, J. E., et al. 2003, ApJ, 599, 786
Sheehy, C. D., et al. 2011, ArXiv e-prints
Sievers, J. L. 2004, PhD thesis, California Institute of Technology, California, USA
Sievers, J. L., et al. 2007, ApJ, 660, 976
Smith, K. M. 2006, Phys. Rev. D, 74, 083002
Smith, K. M., & Zaldarriaga, M. 2007, Phys. Rev. D, 76, 043001
Smith, K. M., et al. 2008, ArXiv e-prints
Spergel, D. N., et al. 2003, ApJS, 148, 175
Szapudi, I., Prunet, S., & Colombi, S. 2001, ArXiv Astrophysics e-prints
Tegmark, M., & de Oliveira-Costa, A. 2001, Phys. Rev. D, 64, 063001
The COrE Collaboration. 2011, ArXiv e-prints
Wraith, D., Kilbinger, M., Benabed, K., Cappe, O., Cardoso, J., Fort, G., Prunet, S., & Robert, C. P. 2009, Phys. Rev. D, 80, 023507
Zaldarriaga, M., & Seljak, U. 1997, Phys. Rev. D, 55, 1830



